Using Git to Get Code


Using Git to retrieve a current baseline of the code

Aside from its popularity as an open source version control system (VCS), Git is also very useful as a way to retrieve current baselines of software. Let's work through a typical example of how this works.

In this example, a systems administrator clones a Git repository from GitHub, a hosting service for many excellent open source projects. I recently created a GitHub project for CM Best Practices and will be using it to promote tools and procedures for configuration audits. For now, though, we will use GitHub to pull down a copy of libpcap, a library used by many projects for low-level network monitoring. We will then configure, build, and install libpcap, a common task for a systems administrator.
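
At a high level, the whole sequence we are about to walk through looks like this (a sketch of the steps shown in detail below):

git clone git://github.com/mcr/libpcap.git   # pull a local copy of the repository
cd libpcap
./configure                                  # set the default build values for this system
make                                         # compile and link
make install                                 # as root (or via sudo): copy the build to its runtime location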

GNU Make for Building and Installing Linux Packages

I am sometimes surprised to hear that some build engineers are familiar only with Ant and Maven and somehow never learned how to use Make. If you are a build engineer, you really should be familiar with Ant, Maven, Make, and also MSBuild for Microsoft Windows applications. There are others, but these are the build tools that you will come across most often.
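
For readers who have never used Make, here is a minimal sketch of a Makefile (the file and target names are hypothetical). Make reads a list of targets, their prerequisites, and the commands that rebuild them, and rebuilds only what is out of date:

# hypothetical minimal Makefile: builds "hello" from hello.c
CC = gcc
CFLAGS = -O2 -Wall

hello: hello.c
	$(CC) $(CFLAGS) -o hello hello.c    # command lines must start with a tab

clean:
	rm -f hello

Running make (or make hello) compiles the program only if hello.c is newer than hello; make clean removes the built binary.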

1. The first step is to clone the repository, pulling a local copy to your workspace.

[testuser@ip-123-123-123-123 gitwork]$ git clone git://github.com/mcr/libpcap.git
Initialized empty Git repository in /home/testuser/gitwork/libpcap/.git/
remote: Counting objects: 9383, done.
remote: Compressing objects: 100% (2336/2336), done.
remote: Total 9383 (delta 6951), reused 9364 (delta 6933)
Receiving objects: 100% (9383/9383), 2.63 MiB | 1711 KiB/s, done.
Resolving deltas: 100% (6951/6951), done.

2. Next, take a look at the files that you have downloaded, and most importantly take note of the configure script. (I will be writing a future article on Autoconf and other configure utilities.) The configure script sets the default values necessary to build the application. It also has some useful options, such as setting the installation prefix, which I will show briefly later in this article. Usually, though, I just take the defaults and run the configure script as shown in step 3. (Note also that I abbreviated some of the output, as indicated below.)


[testuser@ip-123-123-123-123 gitwork]$ ls -lt
total 12
drwxr-xr-x 14 testuser testuser 12288 Apr  8 19:40 libpcap
[testuser@ip-123-123-123-123 gitwork]$ cd libpcap
[testuser@ip-123-123-123-123 libpcap]$ ls -lt
total 2252
-rw-rw-r-- 1 testuser testuser  26446 Apr  8 19:40 CHANGES
-rw-rw-r-- 1 testuser testuser   9541 Apr  8 19:40 CREDITS
drwxrwxr-x 2 testuser testuser   4096 Apr  8 19:40 ChmodBPF
-rw-rw-r-- 1 testuser testuser  17769 Apr  8 19:40 INSTALL.txt
-rw-rw-r-- 1 testuser testuser    873 Apr  8 19:40 LICENSE
-rw-rw-r-- 1 testuser testuser  23318 Apr  8 19:40 Makefile.in
-rw-rw-r-- 1 testuser testuser   4191 Apr  8 19:40 README
-rw-rw-r-- 1 testuser testuser   2214 Apr  8 19:40 README.Win32
-rw-rw-r-- 1 testuser testuser   2810 Apr  8 19:40 README.aix
-rw-rw-r-- 1 testuser testuser   4960 Apr  8 19:40 README.dag
-rw-rw-r-- 1 testuser testuser   8264 Apr  8 19:40 README.hpux
-rw-rw-r-- 1 testuser testuser   5000 Apr  8 19:40 README.linux
-rw-rw-r-- 1 testuser testuser   3521 Apr  8 19:40 README.macosx
-rw-rw-r-- 1 testuser testuser   2045 Apr  8 19:40 README.septel
-rw-rw-r-- 1 testuser testuser   2465 Apr  8 19:40 README.sita
-rw-rw-r-- 1 testuser testuser   1687 Apr  8 19:40 README.tru64
drwxrwxr-x 2 testuser testuser   4096 Apr  8 19:40 SUNOS4
-rw-rw-r-- 1 testuser testuser   1555 Apr  8 19:40 TODO
-rw-rw-r-- 1 testuser testuser     14 Apr  8 19:40 VERSION
drwxrwxr-x 5 testuser testuser   4096 Apr  8 19:40 Win32
-rw-rw-r-- 1 testuser testuser  30161 Apr  8 19:40 aclocal.m4

*** OUTPUT ABBREVIATED ***

3. Run the configure script before you try to build the application.

[testuser@ip-123-123-123-123 libpcap]$ ./configure
checking build system type... i686-pc-linux-gnu
checking host system type... i686-pc-linux-gnu
checking target system type... i686-pc-linux-gnu
checking for gcc... gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed

*** OUTPUT ABBREVIATED ***
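
As the --help text in step 6 explains, configure also accepts environment variables as VAR=VALUE arguments on the command line. For example, you could select the compiler and flags explicitly (a sketch; the defaults are fine for this build):

./configure CC=gcc CFLAGS="-g -O2"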

4. Next, we will build (compile and link) the application using the make command.

[testuser@ip-123-123-123-123 libpcap]$ make
gcc -O2 -fpic -I.  -DHAVE_CONFIG_H  -D_U_="__attribute__((unused))" -g -O2 -c ./pcap-linux.c
gcc -O2 -fpic -I.  -DHAVE_CONFIG_H  -D_U_="__attribute__((unused))" -g -O2 -c ./pcap-usb-linux.c
gcc -O2 -fpic -I.  -DHAVE_CONFIG_H  -D_U_="__attribute__((unused))" -g -O2 -c ./fad-getad.c
if grep GIT ./VERSION >/dev/null; then \
read ver <./VERSION; \
echo $ver | tr -d '\012'; \
date +_%Y_%m_%d; \
else \
cat ./VERSION; \
fi | sed -e 's/.*/static const char pcap_version_string[] = "libpcap version &";/' > version.h
gcc -O2 -fpic -I.  -DHAVE_CONFIG_H  -D_U_="__attribute__((unused))" -g -O2 -c ./pcap.c
gcc -O2 -fpic -I.  -DHAVE_CONFIG_H  -D_U_="__attribute__((unused))" -g -O2 -c ./inet.c
gcc -O2 -fpic -I.  -DHAVE_CONFIG_H  -D_U_="__attribute__((unused))" -g -O2 -c ./gencode.c
gcc -O2 -fpic -I.  -DHAVE_CONFIG_H  -D_U_="__attribute__((unused))" -g -O2 -c ./optimize.c
gcc -O2 -fpic -I.  -DHAVE_CONFIG_H  -D_U_="__attribute__((unused))" -g -O2 -c ./nametoaddr.c
gcc -O2 -fpic -I.  -DHAVE_CONFIG_H  -D_U_="__attribute__((unused))" -g -O2 -c ./etherent.c
gcc -O2 -fpic -I.  -DHAVE_CONFIG_H  -D_U_="__attribute__((unused))" -g -O2 -c ./savefile.c
gcc -O2 -fpic -I.  -DHAVE_CONFIG_H  -D_U_="__attribute__((unused))" -g -O2 -c ./sf-pcap.c
gcc -O2 -fpic -I.  -DHAVE_CONFIG_H  -D_U_="__attribute__((unused))" -g -O2 -c ./sf-pcap-ng.c
gcc -O2 -fpic -I.  -DHAVE_CONFIG_H  -D_U_="__attribute__((unused))" -g -O2 -c ./pcap-common.c
gcc -O2 -fpic -I.  -DHAVE_CONFIG_H  -D_U_="__attribute__((unused))" -g -O2 -c ./bpf_image.c
gcc -O2 -fpic -I.  -DHAVE_CONFIG_H  -D_U_="__attribute__((unused))" -g -O2 -c ./bpf_dump.c
./runlex.sh flex -Ppcap_ -oscanner.c scanner.l
bison -y -p pcap_ -d grammar.y
mv y.tab.c grammar.c
mv y.tab.h tokdefs.h
gcc -O2 -fpic -I.  -DHAVE_CONFIG_H  -D_U_="__attribute__((unused))" -g -O2 -c scanner.c
gcc -O2 -fpic -I.  -DHAVE_CONFIG_H  -D_U_="__attribute__((unused))" -g -O2 -Dyylval=pcap_lval -c grammar.c
rm -f bpf_filter.c
ln -s ./bpf/net/bpf_filter.c bpf_filter.c
gcc -O2 -fpic -I.  -DHAVE_CONFIG_H  -D_U_="__attribute__((unused))" -g -O2 -c bpf_filter.c

*** OUTPUT ABBREVIATED ***
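
On a multi-core machine, GNU Make can run independent compile steps in parallel with the -j option, which can speed up larger builds considerably:

make -j4    # run up to four compile jobs at once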

5. Next, su to root (or use sudo) and issue the make install command to copy the build to its runtime location (e.g., /usr/local/lib).

make install
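
Because the build produces a shared library installed under /usr/local/lib, on many Linux systems you will also want to refresh the dynamic linker cache after installing, so that programs linking against libpcap can find it (assuming /usr/local/lib is listed in /etc/ld.so.conf):

ldconfig    # as root: rebuild the runtime linker cache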

6. You may also want to examine the options available in the configure script. For example, you can set the default installation location using the --prefix option; see the sketch after the help output below.

[testuser@ip-123-123-123-123 libpcap]$ ./configure --help
`configure' configures this package to adapt to many kinds of systems.

Usage: ./configure [OPTION]... [VAR=VALUE]...

To assign environment variables (e.g., CC, CFLAGS...), specify them as
VAR=VALUE.  See below for descriptions of some of the useful variables.

*** OUTPUT ABBREVIATED ***
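
For example, here is a sketch of configuring an install into a home directory, which avoids the need for root access for the make install step (the prefix path is arbitrary):

./configure --prefix=$HOME/local    # install under ~/local instead of /usr/local
make
make install                        # no root needed for a home-directory prefix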

7. Even though you downloaded this code simply to compile and install it, you may want to make some customizations, and tagging and branching can help you keep track of your work.

Let's first see what tags we inherited when we cloned the repository:

[testuser@ip-123-123-123-123 libpcap]$ git tag
lbl_0_4
libpcap_0_5rel1
libpcap_0_5rel2
libpcap_0_6rel1
libpcap_0_6rel2
libpcap_0_7rel1
libpcap_0_7rel2
libpcap_0_8_bp
libpcap_0_8rel1
libpcap_0_8rel2
libpcap_0_8rel3
libpcap_0_9rel1
libpcap_0_9rel2
libpcap_0_9rel3
libpcap_0_9rel4
libpcap_0_9rel5
libpcap_0_9rel6
libpcap_0_9rel7
libpcap_0_9rel8
libpcap_1_0rel0
libpcap_1_1rel0

Next, we will add our own tag for tracking purposes (which will exist only in our local copy):

[testuser@ip-123-123-123-123 libpcap]$ git tag -a v1.0 -m'CMBP Version 1.0'
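
You can verify the new tag, and start a branch for your local customizations, with a few more commands (the branch name here is just an example):

git tag -n                          # list tags along with their annotation messages
git show v1.0                       # inspect the commit that the tag points to
git checkout -b cmbp-local v1.0     # create a local branch starting at the tag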

Conclusion

Version control systems are a must-have for controlling changes to source code during any software or systems development effort. In this article, we looked at how a systems administrator might use Git to obtain a copy of libpcap and to control local baselines that track any required changes. In future articles, we will cover in more detail how to manage branches and baselines using Git. Please drop me a line and let me know what features and challenges you would like us to cover in upcoming issues of the CM Best Practices Newsletter!

Bob Aiello


