Git is a distributed version control system that lets a single developer or a group of developers contribute to software projects. Its popularity grew thanks to GitHub, founded in 2008 as a cloud host for repositories that is free for open source projects. As of April 2017, GitHub reports hosting 57 million repositories, making it the largest host of source code in the world.
Believe it or not, git was created in 2005 by the same individual who created Linux, Linus Torvalds.
If you don't think that is a good enough reason to use git, here are ten more.
As of 2017, there are still people who keep their code only on their local machine. Do these people make backups on a USB drive? Do they share their code with coworkers via email or WhatsApp? Having your code in the cloud has many advantages. It is generally more secure, especially if it is hosted on GitHub, which has a professional security team. It is easier to work collaboratively, because any team member can download the latest version of the repository from any machine. It is also cheaper than a traditional server; in fact, GitHub lets you host open-source projects for free and Bitbucket does the same for private repos.
Git is distributed, meaning that every local copy of the global repository is a fully working copy that contains the complete history. If there is a problem with the server and the global repository is corrupted or lost, any local copy can be used to recreate it. This architecture is the opposite of centralized version control systems like Subversion (SVN), which were very popular before GitHub. In a centralized system, the server holds all the changes in the project and the local copies are just lightweight snapshots of it. If the server is lost and there is no backup, you lose all the history.
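To make the recovery claim concrete, here is a minimal sketch in Python that drives the git command line. It assumes git is installed and on the PATH; the folder names (server, laptop, restored.git), the file notes.txt and the "Example Dev" identity are all made up for the illustration. It creates a small repository, clones it, deletes the original and rebuilds a server-style repository from the clone.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# Hypothetical identity, passed on every call so the sketch does not
# depend on the global git configuration of the machine.
IDENTITY = ["-c", "user.name=Example Dev", "-c", "user.email=dev@example.com"]


def git(*args, cwd):
    """Run one git command inside `cwd` and return its stdout, stripped."""
    result = subprocess.run(
        ["git", *IDENTITY, *args],
        cwd=cwd, check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def recover_from_clone(base):
    """Lose the 'server' and rebuild it from a clone; return the commit count."""
    base = Path(base)
    server = base / "server"         # stand-in for the global repository
    laptop = base / "laptop"         # a developer's local copy
    restored = base / "restored.git"

    # Build a history of three commits on the "server".
    server.mkdir()
    git("init", cwd=server)
    for n in range(3):
        (server / "notes.txt").write_text(f"revision {n}\n")
        git("add", "notes.txt", cwd=server)
        git("commit", "-m", f"Revision {n}", cwd=server)

    git("clone", str(server), str(laptop), cwd=base)
    shutil.rmtree(server)            # simulate losing the server

    # The local copy alone is enough to recreate a bare, server-style repo.
    git("clone", "--bare", str(laptop), str(restored), cwd=base)
    return int(git("rev-list", "--count", "HEAD", cwd=restored))


if __name__ == "__main__":
    if shutil.which("git") is None:
        raise SystemExit("git is not installed")
    with tempfile.TemporaryDirectory() as tmp:
        print("commits recovered:", recover_from_clone(tmp))
```

All three commits survive the loss of the original repository, because the clone carried the whole history with it.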
Git is designed for projects where many contributors develop software in parallel. Specifically, it has a very powerful way to resolve conflicts when two people are working on the same file. Its stability is much higher than SVN's. Those who have used SVN know what I'm talking about.
Git has been around for many years now and it's really easy to find good documentation. I have my own tutorial, where I explain the basics of git and add useful commands that I use from time to time.
Nobody can say git is difficult. To do the day-to-day work, you just need to master these eight commands: git clone, git status, git add, git commit, git push, git pull, git checkout and git branch. Their explanation can be found here. With them you can download a repo, check its status, add files, save them in a commit, send your commits to the server, retrieve the latest changes from the server, and create and switch between branches to work in parallel.
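The eight commands can be tried end to end without touching a real server. The following is a rough sketch, assuming git is installed; the local folder server.git plays the role of the remote, and the names notes.txt, feature and "Example Dev" are invented for the example.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# Hypothetical identity so that `git commit` works on any machine.
IDENTITY = ["-c", "user.name=Example Dev", "-c", "user.email=dev@example.com"]


def git(*args, cwd):
    """Run one git command inside `cwd` and return its stdout, stripped."""
    result = subprocess.run(
        ["git", *IDENTITY, *args],
        cwd=cwd, check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def daily_workflow(base):
    """Run the eight everyday commands once; return (status, current branch)."""
    base = Path(base)
    server = base / "server.git"     # local stand-in for a hosted repository
    work = base / "work"
    git("init", "--bare", str(server), cwd=base)

    git("clone", str(server), str(work), cwd=base)      # git clone
    (work / "notes.txt").write_text("first draft\n")
    status = git("status", "--short", cwd=work)         # git status
    git("add", "notes.txt", cwd=work)                   # git add
    git("commit", "-m", "Add notes", cwd=work)          # git commit
    git("push", "-u", "origin", "HEAD", cwd=work)       # git push
    git("pull", cwd=work)                               # git pull
    git("branch", "feature", cwd=work)                  # git branch
    git("checkout", "feature", cwd=work)                # git checkout
    current = git("symbolic-ref", "--short", "HEAD", cwd=work)
    return status, current


if __name__ == "__main__":
    if shutil.which("git") is None:
        raise SystemExit("git is not installed")
    with tempfile.TemporaryDirectory() as tmp:
        status, branch = daily_workflow(tmp)
        print("status before adding:", status)
        print("branch at the end:", branch)
```

In everyday use you would type these commands directly in a terminal; wrapping them in Python here only keeps the example self-contained and repeatable.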
Branches are one of the best features of a version control system and are used to develop in parallel to the main line of the repository. A branch is a fork of the main code in which you develop a new feature. When you create new functionality in the code, you should create a branch. Then you develop the functionality, test it and, when it works perfectly, integrate it or merge it into the main branch (which is usually the master branch). When using branches, you can have simultaneous versions of the same code.
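The branch workflow described above can be sketched in a few lines of Python that call the git command line. This assumes git is installed; the repository folder project, the branch name new-feature and the file names are invented for the example. The function reports whether the feature file is visible on the main branch before and after the merge.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# Hypothetical identity so that `git commit` works on any machine.
IDENTITY = ["-c", "user.name=Example Dev", "-c", "user.email=dev@example.com"]


def git(*args, cwd):
    """Run one git command inside `cwd` and return its stdout, stripped."""
    result = subprocess.run(
        ["git", *IDENTITY, *args],
        cwd=cwd, check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def branch_and_merge(base):
    """Develop a feature on a branch, then merge it into the main branch."""
    repo = Path(base) / "project"
    repo.mkdir()
    git("init", cwd=repo)
    (repo / "app.txt").write_text("stable code\n")
    git("add", "app.txt", cwd=repo)
    git("commit", "-m", "Initial version", cwd=repo)
    # The main branch may be called master or main depending on the setup.
    main_branch = git("symbolic-ref", "--short", "HEAD", cwd=repo)

    # Create the branch and develop the new functionality there.
    git("checkout", "-b", "new-feature", cwd=repo)
    (repo / "feature.txt").write_text("new functionality\n")
    git("add", "feature.txt", cwd=repo)
    git("commit", "-m", "Add feature", cwd=repo)

    # Back on the main branch the feature does not exist yet.
    git("checkout", main_branch, cwd=repo)
    before_merge = (repo / "feature.txt").exists()

    git("merge", "new-feature", cwd=repo)
    after_merge = (repo / "feature.txt").exists()
    return before_merge, after_merge


if __name__ == "__main__":
    if shutil.which("git") is None:
        raise SystemExit("git is not installed")
    with tempfile.TemporaryDirectory() as tmp:
        print(branch_and_merge(tmp))
```

The two values show the "simultaneous versions" idea: the same folder holds a different version of the code depending on which branch is checked out.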
Code reviews are a good practice that every development team should follow. Hosting services built on git, such as GitHub, facilitate code reviews with an operation called a pull request. A pull request proposes merging one branch into another, typically the master branch. The standard workflow is the following: a developer works in a branch coding a new feature and, when the feature is finished, submits a pull request. A different developer reviews the code of the pull request and suggests improvements. Once the improvements are made, the branch is merged into master, and thus a new feature is added to the project.
Every time you save a set of changes you make a commit, which you can later push to the server. Every commit is referenced by a hash that uniquely identifies it, see this example. With git, it is easy to go back to any past commit and fix a mistake.
If you want to develop open-source code, the biggest host is GitHub. Here you can find the most popular repositories on GitHub, including bootstrap, react, d3, tensorflow and angular.
Git's popularity led to the emergence of other services that complete the software development stack, usually connected to a repository through hooks or integrations. An interesting case is continuous integration systems like Travis, which allow for automatic testing of a software solution. Another useful tool is Waffle, which helps developers plan and track a project and links directly with GitHub. Finally, another interesting tool is Codacy, which generates automatic code reviews.
These are 10 reasons to use git. If you have more, please comment!