Working on the Exchange Analyzer project has taught me a few things about using Git, in much the same way that stepping off the end of a jetty teaches you a few things about water. Before we started that project I was stashing my PowerShell scripts in a Dropbox folder so that they would conveniently sync between my different computers, and doing Stone Age version control by copying whole files and naming them “Script.ps1.old” and “Script.ps1.old2”.
I know, right?
Anyway, one day a friend showed me his Github repositories while demonstrating a Rails deployment, and I decided to give it a try as a place to host my PowerShell code. Using Github has some advantages over my old system:
- It's easy to share code. Github has public and private repositories. I put most of my code in public repositories. I can point people at the latest code easily, without having to make changes in my old Dropbox system and then upload a file somewhere. As a side note, I do upload scripts to the TechNet gallery as well, since that is where a lot of IT pros look. It's just part of my workflow now, so it's not inconvenient.
- It's easy to get feedback. People can raise bug reports and feature requests using Github Issues. I like fixing bugs if it makes the script more useful to the community.
- It's easy to get contributions. People can submit patches for your scripts that fix bugs or change behavior. This has its ups and downs, as you'd expect.
- It's a better way to write code. You can track changes, roll them back, use branches to tinker with features, do proper releases, and so on. I'm going to explain how I use Git in this blog post.
So now I have a Github account with a sprawl of repositories. It was difficult at first to decide what should be stored in its own repository, and what should be added to an existing repository. I've settled on a basic set of rules for myself:
- A script that is a tool unto itself goes into a repository of its own
- A script that is part of a tool or project goes into a repository with the other scripts for that tool or project
- If in doubt, chuck it in a general repo and if it later deserves its own repo then move it
Those rules are partly based on looking at what other people do on Github, and partly based on the knowledge that nothing is set in concrete and you can always change things later.
Getting Started with Github
Before I get into how I use Git let me just say that I'm in no way a Git expert, and this is just what I've managed to learn along the way and how I do things today. I fully expect to learn there are better ways of doing something than the way I am doing it today. Hopefully one of the benefits of me writing this blog post is I learn some of those things from others.
I'm also doing everything the graphical way, no command line stuff, because that's how I learned it. Frankly every Git tutorial I read when I was trying to learn this stuff would always dive into Git commands immediately with no explanation of what the commands actually did. So I learned things the point and click way, and that's how I'll share them here.
Sign up to Github
The first step is obviously to get yourself a Github account. They're free if you want to make everything public, or $7/mth for individuals if you want private repositories. If you're nervous about coding in public then fork out the $7/mth for private repos.
Before you start coding take a few minutes to fill in your Github profile and secure your account with multi-factor authentication.
Create a Repository
After you've signed up and logged in to Github you can create your first repository. You'll find the button in your repository list.
Give your repo a name, which is used for the repo's URL, a description, a README, and choose a license. I use the MIT license because it seems right for me, but you can take the time to learn about licenses and choose your own. If you want your code to be used by the public you should license it accordingly. Some organizations simply won't use code that has no clear license. Leaving the license out doesn't cost you anything of course, but it's a shame to think that you might be sharing some great code that people can't use for lack of a license.
After creating the repository you've now got a place to store your code.
Here's a quick tour of what you can see in the Code tab of your repo:
- Branches – your repo has a default branch called “master”. In a moment we'll create another branch called “development”. I recommend those two branches at a minimum, but I'll also show you later how my Git workflow uses multiple branches. Branches are pointers to different snapshots of your code. I just think of a branch as a parallel universe where a different version of my code is stored, and I can switch between them as necessary, making changes in one branch, merging the changes to another branch, and so on.
- New Pull Request – a PR is a way of saying “Hey I've got this change or addition to the code in your branch, please take a look at it and merge it to your branch if you like it.” I'll show you this in action later.
- Files – you can see the README and LICENSE files, and buttons to create or upload files. Don't do anything to the files in master yet, because you shouldn't be working directly in master (this is the Git workflow we'll get into shortly).
- Clone – cloning a repo basically means downloading a copy of it. Before we get into that you'll need to install a Github client.
- Commits – I think of this as the “Apply” button in most applications, i.e. I've made one change, or several changes, and now I want to apply those changes to my branch. Commits can be rolled back, so it's good practice to commit often, capturing small changes as you make them.
- Releases – I think of these as a snapshot of the code in a branch at a specific point in time. When you create a release you get to assign a tag (or version number) and it wraps up your code into a Zip file. It's a great way to maintain a history of releases of your code without having to trawl back through hundreds of individual commits and PRs.
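The “commits can be rolled back” part is worth seeing once. I do this from the desktop client, but as a rough command-line sketch it looks something like the following. The temp folder, the Script.ps1 file name and the demo identity are all made up for the example.

```shell
# Throwaway repo in a temp folder, so nothing real is touched
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Demo User"          # placeholder identity for the demo commits
git config user.email "demo@example.com"

# A good commit, followed by one I regret
echo 'Write-Host "Working version"' > Script.ps1
git add Script.ps1
git commit -q -m "Add script"
echo 'Write-Host "Broken version"' > Script.ps1
git commit -q -am "Change I will regret"

# Roll the last commit back by adding a new commit that undoes it
git revert --no-edit HEAD
cat Script.ps1                            # back to the working version
```

Note that the revert doesn't delete the bad commit. It adds a third commit that undoes it, so the history still shows what happened.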
Clone a Repository
I use the Github desktop client to work with my code, so I'll be demonstrating it from this point forward. After installing the client and logging in with your Github credentials, you'll be able to clone your repository.
When you clone a repo it will ask you where you want to store it on your local computer. The first time I did this I instinctively chose a folder within my OneDrive folder, which immediately caused OneDrive to start crashing. I guess there's some conflict between the sync engine and the way Git handles files. Maybe it works fine for some people, maybe it would work fine if I tried in Dropbox instead. Either way, I don't really see the point in storing my Github repos in a folder that syncs to other PCs. If I want a repo on another machine, I can just clone it there as well and sync using Git.
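If you're curious what the clone step amounts to outside the desktop client, here's a minimal sketch. To keep it runnable without a Github account, a small local repo stands in for the one on Github, and the folder names and demo identity are invented for the example. With a real repo, the first argument to git clone would be the repo's URL instead.

```shell
work="$(mktemp -d)"

# Stand-in for the repo on Github: a local repo with one commit
git init -q "$work/origin-repo"
cd "$work/origin-repo"
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
echo "# Demo" > README.md
git add README.md
git commit -q -m "Initial commit"

# The clone itself: source first, then the local folder to store it in
cd "$work"
git clone -q "$work/origin-repo" my-clone
cd my-clone
git remote -v        # "origin" points back at where the clone came from
git log --oneline    # the full commit history came down with the files
```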
Create a New Branch
At the moment, the repo only has one branch – master. Master should always contain working code that others can take and run. If you develop directly in master you'll break things now and then, which will only cause grief for people trying to run your code (and therefore, grief for you when you get support requests). Instead, you should create a branch for development, which I like to simply call “development”.
In Github desktop click the button to create a new branch from master called development.
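For reference, the same button click can be sketched on the command line like this. The temp folder and demo identity are assumptions for the example.

```shell
# Throwaway repo with a single commit on master
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
echo "# Demo" > README.md
git add README.md
git commit -q -m "Initial commit"

# Create development from master and switch to it in one step
git checkout -q -b development
git branch           # lists both branches, with * next to the current one
```

At this point development and master point at exactly the same commit. They only drift apart once you start committing to one of them.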
Working with the Development Branch
Now let's take a quick look at how you can switch between the different parallel universes of your code using branches. First we need to make a change to the development branch's code so that there are some differences between the two branches. In Github desktop make sure the development branch is selected, and switch to the Changes view.
So far there are no changes. Click the link to open this repository, and an Explorer window will open with your repo's files. Add a new file to the repository. I'm adding a basic PowerShell script that has the following code in it.
Demo Script for Git Workflow Demo
#Here's the original script code
Write-Host "This is a Git workflow demo"
Speaking of which, the tools you use for writing your code are completely up to you. I use the PowerShell ISE because it is quick and easy. On my to-do list is setting up Visual Studio with Github integration. You can use Notepad, the PowerShell ISE, a flavour of Visual Studio, or anything else that you prefer.
Now there's been a change to the branch. Going back to the Github desktop client, the Changes view for the development branch shows the list of changes that occurred, which in this case is one file with multiple lines of code added. Write a commit message and commit the change to the development branch.
Now we've got two branches – master, and development – and there are differences between them. You can see the obvious difference by putting your Github desktop client side by side with the repository's folder in Explorer, and switching between the two branches.
Here's the development branch:
And here's the master branch:
Notice how the list of files visible in Explorer changes just by switching branches in Github desktop?
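Here's a rough command-line version of that same experiment, in case you want to watch the files appear and disappear without the screenshots. The Demo.ps1 file name, the temp folder and the demo identity are my own placeholders.

```shell
# Throwaway repo with master and development
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
echo "# Demo" > README.md
git add README.md
git commit -q -m "Initial commit"
git checkout -q -b development

# Add the demo script to development and commit it
printf '%s\n' "#Here's the original script code" \
  'Write-Host "This is a Git workflow demo"' > Demo.ps1
git status --short   # "?? Demo.ps1", roughly what the Changes view shows
git add Demo.ps1
git commit -q -m "Add demo script"

# Switch universes and compare the folder contents
ls                   # Demo.ps1 and README.md
git checkout -q master
ls                   # only README.md
```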
Let's say that I do some work in the development branch over the course of a few days, making commits for each small change. I can sync the changes back to Github from the desktop client by clicking on the Publish (if it's a new branch) or Sync (if it's an existing branch) button.
Your branches and their code will be visible on the Github website, along with the history of your commits.
If you make any changes to files via the Github website or from another computer where you've cloned the repo, then you can sync the changes back to your original workstation again and keep developing.
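As far as I can tell, Publish and Sync boil down to pushing and pulling. Here's a sketch of the round trip between two computers, with a local bare repository standing in for Github so it runs offline. The pc1 and pc2 folders, the Demo.ps1 file and the demo identity are all invented for the example.

```shell
work="$(mktemp -d)"

# Stand-in for the repo hosted on Github: an empty bare repository
git init -q --bare "$work/remote.git"
git --git-dir="$work/remote.git" symbolic-ref HEAD refs/heads/master

# First computer: commit on development, then publish the branch
git init -q "$work/pc1"
cd "$work/pc1"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
echo "# Demo" > README.md
git add README.md
git commit -q -m "Initial commit"
git remote add origin "$work/remote.git"
git push -q -u origin master
git checkout -q -b development
echo 'Write-Host "This is a Git workflow demo"' > Demo.ps1
git add Demo.ps1
git commit -q -m "Add demo script"
git push -q -u origin development         # Publish: first push of a new branch

# Second computer: clone, change something on development, push it
git clone -q "$work/remote.git" "$work/pc2"
cd "$work/pc2"
git config user.name "Demo User"
git config user.email "demo@example.com"
git checkout -q development
echo 'Write-Host "Edited on the other computer"' >> Demo.ps1
git commit -q -am "Tweak demo script"
git push -q

# Back on the first computer: Sync brings the change down
cd "$work/pc1"
git pull -q --ff-only
cat Demo.ps1                              # now includes the line added on pc2
```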
Merging Branches with Pull Requests
Once you're happy with the state of your code in the development branch, you can merge it to master with a pull request. Pull requests can be initiated from the Github website, or from the desktop client. I tend to do it in the desktop client because that's where I'm working most of the time. Select the development branch, and then click the Pull Request button in the top right of the desktop client. Make sure it says “from development to master”, or whichever direction you want to merge code. If you create a PR and merge to the wrong branch you can always roll it back later, but it pays to avoid that issue by double-checking here first. Add some comments for the pull request and send it.
On the Github website, open your repo and click on the Pull Requests section. There's lots of information displayed here. You can see how many commits are bundled up into the pull request, which gives you a view of every single change that you committed while you were working on your code. You can also see the files that have been changed, which will give you a view of the difference between your development branch and the master branch you're about to merge to. Things like labels, milestones, and assignees can be ignored if you like. I use them with the Exchange Analyzer team, but usually ignore them for my own stuff. That said, if you label your PRs and other things like Issues then you can look at stats over the months and years you work on your code that tell you how much time you're spending on bug fixes vs new features, as one example.
When you're happy with the PR, click the button to merge it.
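A pull request itself is a Github feature, so there's no exact local equivalent. The review and merge steps can be approximated with plain Git though. This is only a sketch, assuming a throwaway repo, a made-up Demo.ps1 file and a demo identity.

```shell
# Throwaway repo: master with a README, development one commit ahead
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
echo "# Demo" > README.md
git add README.md
git commit -q -m "Initial commit"
git checkout -q -b development
echo 'Write-Host "This is a Git workflow demo"' > Demo.ps1
git add Demo.ps1
git commit -q -m "Add demo script"

# What the PR page summarises: the commits and files that would be merged
git log --oneline master..development
git diff --stat master development

# The merge button, roughly: a merge commit on master
git checkout -q master
git merge -q --no-ff -m "Merge development into master" development
ls                                        # Demo.ps1 is now in master too
```

I've used --no-ff here so that the merge is recorded as its own commit, which keeps a visible marker in the history of where development was merged in.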
Congrats, you're a developer! j/k
Git Branching Models
When we started the Exchange Analyzer project, we quickly realised that we needed to learn how to collaborate on the same code base in Github. Managing your own code in a few branches is simple, but throwing more people into the mix starts to get confusing if you don't manage it properly.
I did a little searching and found this article – a successful Git branching model. After some experimentation we now try to follow that model for the project, and it seems to be working well.
For my own little projects I've started to apply the same model, because it works well for individuals as well as teams. Here's a simple example of how I use it. A repo has some working code in master. A development branch has also been created from master, and some commits are made to development for various things, and merged back to master.
I decide to add a brand new feature to my script. To keep development of the new feature separate, I create a new branch off the development branch, called feature-x. I work on the feature-x branch, making commits along the way. In the meantime, someone reports a bug in my script. I apply a fix in development, test it, and merge it to master so that the stable code base receives the fix. There's an argument here that the fix should have been developed by creating a branch for that fix, then merging it to development instead of committing directly to development, and that's how I'd do it for the Exchange Analyzer project or for a large, complex fix. But for my own individual projects, I'm usually happy to do quick fixes straight in development.
Now the development branch is different from the state it was in when the feature-x branch was first created. To get the same fix applied to the feature-x branch, it can be updated from development in the Github client. This isn't always necessary, because the feature branch can continue on its own and merge into development later, but I like to keep feature branches updated with any significant fixes in the development branch.
Finally, when the new feature has been developed, the feature-x branch can be merged into the development branch with a pull request.
Multiple feature branches and other fixes might be developed over time, and eventually the changes can be merged to master when you're happy that they are stable and ready for public usage.
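The flow described above can be sketched end to end with plain Git commands. This is only an approximation of what I do with the desktop client and pull requests: local merges stand in for the PRs, and the file names, commit messages and demo identity are made up.

```shell
# Throwaway repo with working code in master and a development branch
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
echo 'Write-Host "Version 1"' > Script.ps1
git add Script.ps1
git commit -q -m "Working code in master"
git checkout -q -b development

# The new feature gets its own branch off development
git checkout -q -b feature-x
echo 'Write-Host "New feature"' > Feature.ps1
git add Feature.ps1
git commit -q -m "Start feature x"

# Meanwhile: a quick bug fix straight in development, merged to master
git checkout -q development
echo 'Write-Host "Version 1, bug fixed"' > Script.ps1
git commit -q -am "Fix reported bug"
git checkout -q master
git merge -q --no-ff -m "Merge bug fix into master" development

# Update the feature branch so it has the fix as well
git checkout -q feature-x
git merge -q --no-edit development

# Feature finished: merge to development, and later on to master
git checkout -q development
git merge -q --no-ff -m "Merge feature-x into development" feature-x
git checkout -q master
git merge -q --no-ff -m "Merge development into master" development
git log --oneline --graph                 # shows how the branches split and rejoined
```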
In reality, with multiple contributors forking and merging code for a project, the flow starts to look more like this. A bit chaotic at first glance, but it's a very effective way to collaborate with multiple people on the same project.
Doesn't all this merging totally mess up your files?
No. I admit, I was doubtful at first that merging code between branches was going to work, because I don't come from a developer background with lots of experience with source control systems. Working on my own personal projects I've had no merge conflicts at all. Code from multiple branches happily merges together without incident.
For the collaborative projects, some merge conflicts occur. They are reasonably easy to fix with a bit of manual intervention. I also sometimes find it necessary to make a release branch for preparing the code before it is merged to master, just to tidy up a few lines of code here and there.
But overall, Git handles merging changes from multiple branches just fine, as you'll see yourself when you start working with it.
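If you want to see what a merge conflict and the “manual intervention” look like before you hit one for real, here's a small sketch that causes one on purpose. Two branches change the same line of a made-up Script.ps1 file, and the repo and demo identity are throwaway too.

```shell
# Throwaway repo where master and development edit the same line
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
echo 'Write-Host "Hello"' > Script.ps1
git add Script.ps1
git commit -q -m "Add script"
git checkout -q -b development
echo 'Write-Host "Hello from development"' > Script.ps1
git commit -q -am "Change greeting in development"
git checkout -q master
echo 'Write-Host "Hello from master"' > Script.ps1
git commit -q -am "Change greeting in master"

# Git can't pick a winner for that line, so the merge stops
git merge development || echo "Merge stopped with a conflict"
git status --short                        # "UU Script.ps1" marks the conflicted file
cat Script.ps1                            # both versions, between <<<<<<< and >>>>>>> markers

# Manual intervention: write the version you want, then finish the merge
echo 'Write-Host "Hello from both"' > Script.ps1
git add Script.ps1
git commit -q -m "Merge development, resolving greeting conflict"
```

When the branches touch different files, or different parts of the same file, none of this is needed and the merge just goes through.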
Something that I first learned about with the Exchange Analyzer project, and have started using for personal projects as well, is the concept of releases. Releases are a way of packaging your code so that it can be easily downloaded by other people. Yes, they can download code from your repos by cloning or downloading a zip, or even by grabbing one file at a time. But a release is more convenient. You can write notes about the release, change logs, add other files into the download package, etc. Releases are also a way to tag your code with version numbers over time.
And one of the best things about releases is that you can simply link people to your releases page, and they'll always be able to see the most recent release, along with the release history.
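Github releases are built on top of Git tags, with the release notes and download page added by the website. The tag-and-zip part can be approximated locally, which helped me understand what a release actually captures. In this sketch the version number, file names, output folder and demo identity are all placeholders.

```shell
# Throwaway repo with one commit worth releasing
repo="$(mktemp -d)"
out="$(mktemp -d)"                        # somewhere outside the repo for the zip
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
echo 'Write-Host "Version 1"' > Script.ps1
git add Script.ps1
git commit -q -m "Working code in master"

# Tag the current commit with a version number and a short note
git tag -a v1.0.0 -m "First release"
git tag --list                            # v1.0.0

# Wrap up the code exactly as it was at that tag into a zip file
git archive --format=zip --output="$out/demo-v1.0.0.zip" v1.0.0
ls "$out"                                 # demo-v1.0.0.zip
```

Because the zip is built from the tag rather than from whatever is in the folder right now, later commits don't change what an old release contains.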
In this blog post I've shown you an example of how I use Github to manage my PowerShell code projects. I'm not a Git expert by any means, and I'm still learning new tricks as I go along, but this flow is working well for me today. Hopefully you found it useful to help you get started with using Github for your own development projects.
If you have any questions about any of the examples above, or suggestions about how I can use Github better, please leave a comment below.