Providing Solutions For Life

Unix cheat sheet

In spite of years of working on Unix and Linux, I still wasn't confident that I knew enough, so I read up on it over the holidays. It turns out we can get our way around these amazing operating systems using the commands below:

The OS uses a filesystem to store data on the hard disk. There are different filesystem types, such as NTFS, FAT and ext4.

Commands to get you through!

pwd ==> Prints the working directory you are in. 


ls ==> lists files and directories. Useful flags:
  • -l => long (detailed) listing format
  • -a => show hidden files too
  • -h => human-readable file sizes (used together with -l)
cd ==> change directory

find location -type f -name filename   ==> searches under location for a regular file called filename.
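If you ever need the same search from inside a script, here is a minimal Python sketch of what the find command above does. The helper name find_files and the throwaway directory tree are made up for illustration; only pathlib and tempfile from the standard library are used.

```python
import os
import tempfile
from pathlib import Path


def find_files(location, filename):
    """Return every regular file called `filename` under `location`, sorted."""
    return sorted(p for p in Path(location).rglob(filename) if p.is_file())


# Small demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "a", "b"))
    Path(root, "a", "notes.txt").write_text("one")
    Path(root, "a", "b", "notes.txt").write_text("two")
    Path(root, "a", "b", "other.txt").write_text("three")
    matches = [str(p.relative_to(root)) for p in find_files(root, "notes.txt")]
    print(matches)
```

The is_file() check plays the role of -type f, so directories that happen to share the name are skipped.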

parted ==> to partition a disk (for example, before putting an ext4 filesystem on it)

mkfs ==> to create filesystems

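echo $PATH  ==> prints the list of directories the shell searches for commands

env ==> to list the environment variables

ps -efl  ==> to print details of the processes that are running

ps $$ ==> shows the process entry of the current shell ($$ expands to the shell's PID; echo $$ prints just the PID)

jobs ==> lists the jobs of the current shell (background and stopped)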

!! ==> to re-run the last command

history ==> very handy for seeing which commands have been run.

sleep 30 ==> pauses for 30 seconds; run in the foreground, it keeps the terminal busy for that long.

whoami  ==> if you suffer from short-term memory loss :) this tells you which login you are currently using.

nohup ==> prefixing a command with it makes the process ignore the hangup signal (SIGHUP), so it keeps running after you log out or close the terminal.

& ==> appending this to a command runs it in the background.

fg %<job_number> ==> brings the job to the foreground.


Global profile (/etc/profile)
User profile (~/.bash_profile)



[The million-dollar question: what is the difference between a job and a process?
A job is one or more processes grouped together by the shell; a job is a Unix shell concept, while a process is something the operating system itself tracks.]

Accessing the Google APIs using Oauth 2 in Python

In its simplest form, OAuth works as described in a document I created earlier.

OAuth is such a simple protocol, yet so much fuss is made about it. It's so simple that you can understand it just by reading its specification!

The diagram below shows the flow:

     +--------+                               +---------------+
     |        |--(A)- Authorization Request ->|   Resource    |
     |        |                               |     Owner     |
     |        |<-(B)-- Authorization Grant ---|               |
     |        |                               +---------------+
     |        |
     |        |                               +---------------+
     |        |--(C)-- Authorization Grant -->| Authorization |
     | Client |                               |     Server    |
     |        |<-(D)----- Access Token -------|               |
     |        |                               +---------------+
     |        |
     |        |                               +---------------+
     |        |--(E)----- Access Token ------>|    Resource   |
     |        |                               |     Server    |
     |        |<-(F)--- Protected Resource ---|               |
     +--------+                               +---------------+
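To make steps (A) to (F) a bit more concrete, here is a toy, in-memory walk through the diagram. Nothing in it talks to a real server, and every function and variable name is made up for illustration; treat it as a sketch of the idea, not of any real library.

```python
import secrets

# Toy, in-memory stand-ins for the three parties on the right of the diagram.
# Nothing here talks to a real server; every name is made up for illustration.
issued_grants = set()
issued_tokens = set()


def resource_owner_approves(client_name):
    """(A)/(B): the owner is asked for access and hands back a grant."""
    grant = "grant-" + secrets.token_hex(4)
    issued_grants.add(grant)
    return grant


def authorization_server_exchange(grant):
    """(C)/(D): a known grant is swapped, once, for an access token."""
    if grant not in issued_grants:
        raise PermissionError("unknown authorization grant")
    issued_grants.discard(grant)
    token = "token-" + secrets.token_hex(8)
    issued_tokens.add(token)
    return token


def resource_server_fetch(token):
    """(E)/(F): a known access token unlocks the protected resource."""
    if token not in issued_tokens:
        raise PermissionError("invalid access token")
    return "protected resource"


grant = resource_owner_approves("my-client")
access_token = authorization_server_exchange(grant)
resource = resource_server_fetch(access_token)
print(resource)
```

The point to notice is that the client never sees the owner's password: it only ever holds a grant and then a token.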



Authentication and authorisation are two different things; here we will cover authorisation, using Python.

The first thing is to get a consumer key and secret (OAuth 2 calls them client ID and client secret), which you get by registering your app on the API Access pane of the Google APIs Console.



Google recommends using client libraries because they are safer, so we will use the oauth2client module, which already ships with Google's client library.

Before we go ahead, let's discuss the "flow".

First, using the Flow, we get credentials that can access a certain service, and these are then stored.

FLOW ===> Credentials ==> Storage

"Flow" class -- flow_from_clientsecrets 


It is imported with
from oauth2client.client import flow_from_clientsecrets
This is used to acquire credentials that authorise our app to access a user's data. For a user to grant access, the OAuth 2.0 steps may require your application to redirect their browser several times. Let's see below how the flow object is created.

flow = flow_from_clientsecrets('path of client_secrets.json',
                               scope='https://www.googleapis.com/auth/Resource',
                               redirect_uri='http://example.com/auth_return')
You can download a client_secrets.json file for your app from the Google APIs Console and place it in the folder where the Python script lives.
The scope says that the script wants access only to a particular resource, such as Google Groups or Calendar. It is a safeguard: if you ever lose control of the script, it cannot run wild over all your resources. So we limit it upfront. Damage control ;)
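To see where the scope and redirect URI end up, here is a rough sketch of the kind of authorization URL the user's browser gets sent to. The parameter names follow the generic OAuth 2.0 authorization request; the endpoint and the client ID are assumptions for illustration, and the URL the library actually builds may differ.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumed values for illustration only; a real app takes them from client_secrets.json.
AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/auth"  # assumed endpoint
params = {
    "response_type": "code",
    "client_id": "my-client-id.example",  # hypothetical client ID
    "redirect_uri": "http://example.com/auth_return",
    "scope": "https://www.googleapis.com/auth/Resource",
}
auth_url = AUTH_ENDPOINT + "?" + urlencode(params)
print(auth_url)

# Reading the query back shows the single scope the user will be asked to approve.
requested = parse_qs(urlsplit(auth_url).query)
print(requested["scope"])
```

Whatever is not listed in scope is simply never asked for, which is the damage control mentioned above.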

Though, as per the flow, Credentials should have been the next topic, for now we just keep a placeholder for them; they are defined further down.

Storage 

First we import this class with
from oauth2client.file import Storage
Note that it's oauth2client.file and not oauth2client.client.
Then

storage = Storage('some file.dat')
This points the storage at a file of that name, where the credentials will be kept for further transactions.
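If you are wondering what a file-backed store boils down to, here is a minimal sketch. JsonFileStore is a hypothetical stand-in written for this post, not oauth2client's Storage class, and the real class may well work differently inside.

```python
import json
import os
import tempfile


class JsonFileStore(object):
    """Hypothetical stand-in for a credentials store; not oauth2client's Storage."""

    def __init__(self, filename):
        self.filename = filename

    def get(self):
        """Return the saved dict, or None when nothing has been stored yet."""
        if not os.path.exists(self.filename):
            return None
        with open(self.filename) as handle:
            return json.load(handle)

    def put(self, credentials):
        """Write the credentials dict to disk as JSON."""
        with open(self.filename, "w") as handle:
            json.dump(credentials, handle)


path = os.path.join(tempfile.mkdtemp(), "some file.dat")
store = JsonFileStore(path)
first_read = store.get()            # nothing saved yet
store.put({"access_token": "y"})
second_read = store.get()
print(first_read, second_read)
```

The useful property is that get() returns None on the very first run, which is what lets a script decide whether it has to send the user to the browser.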

Now, finally, for the credentials. But before we go there, here is what the documentation of oauth2client.tools.run() says:

The run() function is called from your application and runs through all the
steps to obtain credentials. It takes a Flow argument and attempts to open an
authorization server page in the user's default web browser. The server asks
the user to grant your application access to the user's data. If the user
grants access, the run() function returns new credentials. The new credentials
are also stored in the Storage argument, which updates the file associated
with the Storage object.

So to get it into Python we do

from oauth2client.tools import run

Credentials 

The Flow object can create the Credentials for you.  
A Credentials object holds refresh and access tokens that authorize access to a single user's data. 
So, finally, to create the credentials:

credentials = run(flow, storage)

So the bare minimum code so far is as in the screenshot below; when we run it, it opens our default browser as below


The file we specified earlier, "some file.dat", gets created and now holds the authorisation tokens and so on, as below

{"_module": "oauth2client.client", "token_expiry": "2013-11-29T15:13:12Z", "access_token": "y"...}

Now, to avoid going to the browser and authorising the app every single time, we can check whether the stored file already holds credentials; if it does, we use them, otherwise we create new ones.

credentials = storage.get()
if credentials is None or credentials.invalid:
    print('invalid credentials')
    # Run the flow; run() also saves the new credentials in storage for subsequent runs.
    credentials = run(flow, storage)
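The idea of "reuse what is stored while it is still good" can be sketched without the library at all. The helper needs_new_credentials below is hypothetical, and it only looks at the token_expiry string in the format seen in the stored file; the library's own invalid flag may be decided quite differently.

```python
from datetime import datetime, timedelta

EXPIRY_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # matches "2013-11-29T15:13:12Z"


def needs_new_credentials(stored, now):
    """Hypothetical check: True when nothing is stored or the token has expired."""
    if stored is None:
        return True
    expiry = datetime.strptime(stored["token_expiry"], EXPIRY_FORMAT)
    return now >= expiry


stored = {"access_token": "y", "token_expiry": "2013-11-29T15:13:12Z"}
before = datetime(2013, 11, 29, 15, 0, 0)
after = before + timedelta(hours=1)
print(needs_new_credentials(None, before))
print(needs_new_credentials(stored, before))
print(needs_new_credentials(stored, after))
```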


Finally, using the credentials we have got!

Credentials objects are applied to httplib2.Http objects to authorize access.

We will use the authorize() function of the Credentials class to apply the necessary credential headers to all requests made by an httplib2.Http instance.

httplib2.Http is used to place requests to the web; see the screenshot below, where I am just fetching a page that needs no authentication, like Google.

from httplib2 import Http
Note the capital H in Http.



So we authorise our requests using the credentials' authorize() function:

http = credentials.authorize(Http())
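Conceptually, "applying credential headers" means that each request carries the access token along with it. A minimal sketch, assuming the token is sent as an OAuth 2.0 bearer token; with_bearer_token is a hypothetical helper, and how oauth2client does this internally is not shown here.

```python
def with_bearer_token(headers, access_token):
    """Return a copy of `headers` with an OAuth 2.0 bearer Authorization header."""
    authorized = dict(headers or {})
    authorized["Authorization"] = "Bearer " + access_token
    return authorized


plain = {"Accept": "application/json"}
signed = with_bearer_token(plain, "y")
print(signed)
```

The original headers are left untouched, so the same plain request can still be sent unauthenticated.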

Once an httplib2.Http object has been authorised, it is typically passed to the build function, for example for the Calendar API or the Groups Settings API:

from apiclient.discovery import build

service = build('calendar', 'v3', http=http)

service = build('groupssettings', 'v1', http=http)

Agile Python Development and the Laundry list

We need method to the madness!

There are apparently two: one is waterfall and the other is agile.

When?

The Manifesto for Agile Software Development was produced in February 2001.

Why Agile?

It is more pragmatic, as it accepts that not everything can be documented at the beginning of a project. But it does put rigorous checks in place to accommodate changes, so that any change can be absorbed through good communication and iterative design and development.

Agile development can be applied to any project, although testing tools and build automation tend to be very language-specific. Python has these tools too.

Jargon again!

Agile methodologies 

  • XP - Extreme programming -- focuses almost exclusively on the developer and development techniques
  • DSDM - Dynamic Systems Development Method -- focuses completely on processes
  • there could be many more.

The spirit of it, from the manifesto!


Individuals and interactions over processes and tools
Working software over comprehensive documentation
Customer collaboration over contract negotiation
Responding to change over following a plan


Agile methods are a collection of various techniques:

Laundry list


On-site customer - the person who will actually use the product, available to the team.

Pair programming - having worked solo since the beginning of my career, I know how it feels to bang your head against a computer alone; it's better for two to be doing this.

System metaphor - using the same names throughout the program and treating anything else as a bug. IDE refactoring tools help in changing names across the system.

IDEs are integrated development environments, like PyCharm, Eclipse, etc.

Refactoring is the practice of simplifying and clarifying code whenever possible.

Documentation - yes, you read that right! There is documentation too, but it should be minimal: limited to what is necessary for the participants in the development process to communicate.

Simple design - fulfilling the requirements of the user stories and nothing more. The code is the design, and it should be as simple as possible.

Principles for it:
- DRY - "Don't repeat yourself": when you see the same thing being done again and again in your code, refactor! Maybe pull it out into a function or method. Also known as DIE, "duplication is evil".
- YAGNI - "You aren't gonna need it", together with "do the simplest thing that could possibly work" (DTSTTCPW): no point in carrying a useless burden around.
- KISS - an acronym for the design principle "Keep it simple, stupid!".
Better raw than wrong - just say things in a simple and direct way.
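DRY is easiest to see in a before and after. The sketch below is a made-up example (the function names and the contact data are hypothetical): the same formatting is written out twice, then pulled into one helper.

```python
# Before: the same formatting logic is written out twice.
def report_before(user, admin):
    user_line = user["name"].strip().title() + " <" + user["email"].lower() + ">"
    admin_line = admin["name"].strip().title() + " <" + admin["email"].lower() + ">"
    return [user_line, admin_line]


# After: the repeated step lives in one helper, so a fix is made in one place.
def format_contact(person):
    return person["name"].strip().title() + " <" + person["email"].lower() + ">"


def report_after(user, admin):
    return [format_contact(user), format_contact(admin)]


user = {"name": " ada lovelace ", "email": "Ada@Example.com"}
admin = {"name": "grace hopper", "email": "GRACE@example.com"}
print(report_after(user, admin))
```

Both versions return the same thing; the second one just has a single place to change when the format changes.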

Short iterations - the customers and the dev team meet at regular intervals, generally every week, to discuss what has been done, maybe give a demo, and then plan what needs to be done in the coming week. This avoids any nasty surprises for the client.

Collective code ownership - it's not one man's (or his father's) property; thanks to pair programming, the strong attachment to the code is dissolved.

Producing short iterations is made possible through automation. There is automation from the developer's desk to the production environment. A human touch is only needed when things go wrong!

To err is human, and hence manual processes should be avoided.

Continuous reflection - ongoing analysis of how development is going.
Documenting everything that is being done can CYA! Trust me, I've learned this the hard way. So no matter which company you are in, keep shouting and bragging about what you are doing, and if possible give it some jazzy acronyms or create a buzzword so that others are intimidated on hearing it. This is how this godforsaken industry operates, unfortunately.

Continuous Integration:

Have you saved the file? Do you have a backup? If these words make your heart skip a beat, then remember to check your code into the repo as often as possible, so that a single place keeps all the artifacts necessary to produce the build.
This could include build scripts, tool scripts, properties files, installation scripts, third-party libraries, tests, and tool configurations (e.g., IDE configuration files).

The program is built, installed, executed, and tested throughout development. This guarantees that there will be no deaths in the team at the time of production!


Test Driven Development:

Tests are written before the actual code is written to solve the problem. 

Types of tests are

Unit tests work at the level of methods and functions. With these tests, changes are easy to make because bugs are detected immediately. Unit tests fall into two broad categories, programmer tests and customer tests; what they test distinguishes them from each other.
  • Programmer tests prove that the code does what the programmer expects it to do. They verify that the code works.

  • Customer tests (a.k.a. acceptance tests) help to ascertain that the code behaves the way the customer would expect it to. They verify behaviour at the level of classes and complete interfaces.
Unit testing is a part of regression testing.
Regression tests catch bugs that have been seen and fixed before, should they come back. Unit tests should run fast; otherwise programmers will not run them, and that could lead to a lot of bugs at the end.
Two of the most common tools in Python for unit testing are unittest and Nose.
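Here is a small test-first sketch with unittest. The function slugify and its expected behaviour are invented for this example: the idea is that the two tests were written first, and the function body was added afterwards to make them pass.

```python
import unittest


def slugify(title):
    """Written second, to make the tests below pass."""
    return "-".join(title.lower().split())


class SlugifyTest(unittest.TestCase):
    """Written first: these tests describe what slugify should do."""

    def test_spaces_become_hyphens(self):
        self.assertEqual(slugify("Unix cheat sheet"), "unix-cheat-sheet")

    def test_extra_whitespace_is_collapsed(self):
        self.assertEqual(slugify("  Agile   Python "), "agile-python")


# Run the suite programmatically so the example works when pasted into a script.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())
```

Tests this small run in milliseconds, which is exactly what keeps programmers running them.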

Functional testing verifies that the complete application works as expected. It is done by a QA department that is directly integrated into the development process. It verifies that the customer sees what they would expect to see and that no bugs have crept in.

I still haven't got Agile, what the heck is it?

It simply means that you will have the requested app or program at the earliest, even though it might not yet have the full functionality of the finished application!
 

Google APIs for Python - don't feel unfortunate anymore if you have to work on these

For the past 4 years I've been unfortunate enough to work on Google APIs, the kind for which even Google cannot find the documentation (maybe because it doesn't exist), and on which Stack Overflow, no matter how much it overflows, cannot give us much.

Below is a summary of everything I've learned about cracking the Google APIs.

At times I've spent days on end cracking one thing, and I wish I had known these tricks so that it wouldn't have taken me so long.

So firstly, try to read as much as possible of the getting started guide. This will take you to a world of links; go to those links and take in whatever you can (don't worry if you don't understand everything). [The trade secret of software engineers is to make the simple sound as complex as possible, and intimidate others!]

But the world of Python is different!
"Simple is better than complex." -- Zen of Python

So the magical commands in Python are:

dir()
type()
help() (keeping the best for last)

also inspect.getargspec()

Let's discover each one of them:

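dir() - called with no arguments, it lists the names in the current scope; called with an object, it lists that object's attributes and methods.

(It doesn't list the names of built-in functions and variables, though. If you want a list of those, use the standard module __builtin__, e.g. dir(__builtin__) after importing it; in Python 3 the module is called builtins.)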
For example, say you have built a service object, e.g.
service = discovery.build('admin', 'reports_v1', http=http)
and you want to know which functions this object offers.
Just run dir(service) to get the list below (ignore the u' prefixes; they only mark unicode strings):
['__class__', '__delattr__', '__dict__', '__doc__', '__format__', '__getattribute__', '__getstate__', '__hash__', '__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_add_basic_methods', '_add_nested_resources', '_add_next_methods', '_baseUrl', '_developerKey', '_dynamic_attrs', '_http', '_model', '_requestBuilder', '_resourceDesc', '_rootDesc', '_schema', '_set_dynamic_attr', '_set_service_methods', u'activities', u'customerUsageReports', u'userUsageReport']
From the result above we can see three public methods: activities, customerUsageReports and userUsageReport. Now, to know which methods sit inside these, we can run dir(service.customerUsageReports()) and get a list:
['__class__', '__delattr__', '__dict__', '__doc__', '__format__', '__getattribute__', '__getstate__', '__hash__', '__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_add_basic_methods', '_add_nested_resources', '_add_next_methods', '_baseUrl', '_developerKey', '_dynamic_attrs', '_http', '_model', '_requestBuilder', '_resourceDesc', '_rootDesc', '_schema', '_set_dynamic_attr', '_set_service_methods', u'get', u'get_next']
So we see that there are two more methods, get and get_next, for us to use!
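Since the interesting names are always buried under a pile of underscore names, a tiny filter saves some squinting. This is a minimal sketch; FakeResource is a hypothetical stand-in so that the example runs without any Google library, and the two helper functions are my own.

```python
class FakeResource(object):
    """Hypothetical stand-in for an API resource object."""

    def __init__(self):
        self._http = None

    def get(self):
        return "report"

    def get_next(self):
        return None


def public_names(obj):
    """Names from dir(obj) that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith("_")]


def public_methods(obj):
    """The subset of public names that are callable."""
    return [name for name in public_names(obj) if callable(getattr(obj, name))]


resource = FakeResource()
print(public_names(resource))
print(public_methods(resource))
```

On a real service object you would call public_names(service) in the same way.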


Now if at some point you would like to know what the datatype of an object is, use

type(service.customerUsageReports())



this will give the type of the object, e.g. <class 'apiclient.discovery.Resource'>
Now for the best one, help(). My fascination for Python has increased all the more after learning about this command. It's awesome. See how it can help us.
Now, how do we find out, in the really mediocre documented Google world, which parameters the get method takes?


help(service.customerUsageReports().get)


gives the below
method(self, **kwargs) method of apiclient.discovery.Resource instance
    Retrieves a report which is a collection of properties / statistics for a specific customer.

    Args:
      date: string, Represents the date in yyyy-mm-dd format for which the data is to be fetched. (required)
      pageToken: string, Token to specify next page.
      parameters: string, Represents the application name, parameter name pairs to fetch in csv as app_name1:param_name1, app_name2:param_name2.

    Returns:
      An object of the form:

        { # JSON template for a collection of usage reports.
          "nextPageToken": "A String", # Token for retrieving the next page
          "kind": "admin#reports#usageReports", # Th

Seeing this, we know that it takes the date parameter in a certain string format. This saves lives! :)


Adding to the above, there is also the below, which can be used and might help, though it hasn't helped me much:

import inspect


inspect.getargspec(someMethod)
Get the names and default values of a Python function’s arguments. A tuple of four things is returned: (args, varargs, keywords, defaults). args is a list of the argument names (it may contain nested lists).
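On newer Python 3 versions getargspec is no longer available, and inspect.signature does the same job. A minimal sketch; get_report is a hypothetical function whose arguments simply mirror the help() text above, not a real Google API call.

```python
import inspect


def get_report(date, pageToken=None, parameters=None):
    """Hypothetical function whose arguments mirror the help() text above."""
    return date


signature = inspect.signature(get_report)
names = list(signature.parameters)
# Parameters without a default value are the required ones.
required = [
    name
    for name, param in signature.parameters.items()
    if param.default is inspect.Parameter.empty
]
print(names)
print(required)
```

Keep in mind that for a method declared as (self, **kwargs), like the one in the help() output, this kind of inspection only reveals kwargs, which is probably why it hasn't helped me much with the Google objects.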

No more head breaking, folks!

Instead, break the code!