Showing posts with label plot. Show all posts
Showing posts with label plot. Show all posts

Thursday, October 16, 2014

Please stop using colormaps which don't translate well to grayscale

I recently printed a paper with very nice results in B&W but the color images simply did not make sense when printed in grayscale (here is the paper if you are curious). Why? Not the best choice of colormap. Jake Vanderplas reminded me of this issue with his very nice blog post.

Please don't choose colormaps for the images in your paper which do not translate well when printed in grayscale.

Check out Nature's advice on color coding. Also check out the advice here.

Feel free to suggest other useful references about this issue in the comments.

Monday, December 16, 2013

Interactive plots in ipython notebooks

A couple of ways of doing interactive plots and animations in an ipython notebook:

I am sure there are many other ways, and if you know of nice ways please post in the comments. 

Thursday, February 28, 2013

Library of plotting routines for Python and IDL

Last January I was walking around the place where the posters were being presented at the American Astronomical Society meeting in Long Beach. I stumbled upon a very interesting poster advertising the astroplotlib web page.

The purpose of this page, in the language of its curator, is that it is
[...] a multi-language astronomical library of plots. It is a collection of templates that are useful to create paper-quality figures. So far all codes are written in IDL, some in Python and very few in Mathematica. 

Most of the codes available there are written in IDL (ugh) but there are quite a few for Python. I hope the python codebase will grow in the future.

For instance, I have been using Bayesian modeling of data and I need to plot contours for the joint distribution of parameters as well as the separate posterior distribution histograms, all in the same plot. After a quick search in the website, I found the code I need to get started. 

If you have contributions (codes or suggestions), please contact the webmaster, Leonardo Ubeda.



Thursday, February 21, 2013

Example of how to do a great animation in Python

If you are interested in creating animation of simulation data in Python, perhaps this should give you some inspiration: Animating the Lorenz System in 3D.

The author solved the Lorenz system of equations and plotted the time evolution of the system in 3D. It illustrates how to make a great animation in a simple way.

You should also check out the Matplotlib Animation Tutorial written by the author, which provides the background on the Lorenz animation.

Great gallery of IPython notebooks

You know you can create IPython notebooks "a la Mathematica" and put them in your webpage, right? That's a great way of sharing a step-by-step analysis workflow with someone.

I stumbled upon a great collection of IPython notebooks that illustrate a vast range of capabilities: data analysis, statistics, interfacing with C/Fortran, plots, numpy tricks and a lot more!

You should check it out: A gallery of interesting IPython notebooks at github (via @profjsb).


Monday, January 14, 2013

Nice matplotlib tutorial

While I was searching for a solution on how to plot correctly the ticks in the x axis of a plot, I came across this very nice and clear matplotlib tutorial.

Definitely check it out.


Friday, February 24, 2012

Plots with several histograms

Creating a plot with two histograms

Here is a method that you can use to plot two histograms in the same figure sharing the same X-axis, keeping some distance between the histograms:

 def twohists(x1,x2,xmin,xmax,x1leg='$x_1$',x2leg='$x_2$',xlabel='',sharey=False):  
      """  
 Script that plots two histograms of quantities x1 and x2.   
   
 Arguments:  
 - x1,x2: arrays with data to be plotted  
 - xmin,xmax: lower and upper range of plotted values, will be used to set a consistent x-range  
      for both histograms.  
 - x1leg, x2leg: legends for each histogram       
 - xlabel: self-explanatory.  
   
 Inspired by http://www.scipy.org/Cookbook/Matplotlib/Multiple_Subplots_with_One_Axis_Label.  
   
 Rodrigo Nemmen  
 v1 Dec. 2011  
 v1.1 Feb. 2012: Added sharey argument.  
      """  
   
      pylab.clf()  
      pylab.rcParams.update({'font.size': 15})  
      fig=pylab.figure()  
        
      a=fig.add_subplot(2,1,1)  
      if sharey==True:  
           b=fig.add_subplot(2,1,2, sharex=a, sharey=a)  
      else:  
           b=fig.add_subplot(2,1,2, sharex=a)  
        
      a.hist(x1,label=x1leg,color='b')  
      a.legend(loc='best',frameon=False)  
      a.set_xlim(xmin,xmax)  
        
      b.hist(x2,label=x2leg,color='r')  
      b.legend(loc='best',frameon=False)  
        
      pylab.setp(a.get_xticklabels(), visible=False)  
   
      b.set_xlabel(xlabel)  
      b.set_ylabel('Number',verticalalignment='bottom')  
      pylab.minorticks_on()  
      pylab.subplots_adjust(hspace=0.15)  
      pylab.draw()  

... and here is a example script that uses the method above:

 """  
 Illustrates how to use the twohists method.  
 """  
 import nemmen  
 import scipy, pylab  
   
 # Generates a normal distribution  
 x1=scipy.random.standard_normal(100)  
   
 # Generates a uniform random distribution  
 x2=scipy.random.uniform(-3,3,100)  
   
 nemmen.twohists(x1,x2,-3,3,'Normal','Uniform')  
   
 pylab.show()  

... to create the following plot:



Creating a plot with three histograms


I also wrote a recipe that makes a plot with three histograms:

 def threehists(x1,x2,x3,xmin,xmax,x1leg='$x_1$',x2leg='$x_2$',x3leg='$x_3$',xlabel='',sharey=False):  
      """  
 Script that plots three histograms of quantities x1, x2 and x3.   
   
 Arguments:  
 - x1,x2,x3: arrays with data to be plotted  
 - xmin,xmax: lower and upper range of plotted values, will be used to set a consistent x-range  
      for both histograms.  
 - x1leg, x2leg, x3leg: legends for each histogram       
 - xlabel: self-explanatory.  
 - sharey: sharing the Y-axis among the histograms?  
   
 Example:  
 x1=Lbol(AD), x2=Lbol(JD), x3=Lbol(EHF10)  
 >>> threehists(x1,x2,x3,38,44,'AD','JD','EHF10','$\log L_{\\rm bol}$ (erg s$^{-1}$)',sharey=True)  
   
 Inspired by http://www.scipy.org/Cookbook/Matplotlib/Multiple_Subplots_with_One_Axis_Label.  
   
 Rodrigo Nemmen  
 v1 Dec. 2011  
 v1.1 Feb. 2012:     Added sharey keyword.  
      """  
   
      pylab.clf()  
      pylab.rcParams.update({'font.size': 15})  
      fig=pylab.figure()  
        
      a=fig.add_subplot(3,1,1)  
      if sharey==True:  
           b=fig.add_subplot(3,1,2, sharex=a, sharey=a)  
           c=fig.add_subplot(3,1,3, sharex=a, sharey=a)  
      else:  
           b=fig.add_subplot(3,1,2, sharex=a)  
           c=fig.add_subplot(3,1,3, sharex=a)            
        
      a.hist(x1,label=x1leg,color='b')  
      a.legend(loc='best',frameon=False)  
      a.set_xlim(xmin,xmax)  
        
      b.hist(x2,label=x2leg,color='r')  
      b.legend(loc='best',frameon=False)  
   
      c.hist(x3,label=x3leg,color='y')  
      c.legend(loc='best',frameon=False)  
        
      pylab.setp(a.get_xticklabels(), visible=False)  
      pylab.setp(b.get_xticklabels(), visible=False)  
   
      c.set_xlabel(xlabel)  
      b.set_ylabel('Number')  
      pylab.minorticks_on()  
      pylab.subplots_adjust(hspace=0.15)  
      pylab.draw()  

... and as before, here is a script that illustrates how to use the above method:

 """  
 Illustrates how to use the threehists method.  
 """  
 import nemmen  
 import scipy, pylab  
   
 # Generates a normal distribution  
 x1=scipy.random.standard_normal(100)  
   
 # Generates a uniform random distribution  
 x2=scipy.random.uniform(-3,3,100)  
   
 x3=scipy.random.standard_normal(1000)  
   
 nemmen.threehists(x1,x2,x3,-3,3,'Normal ($n=100$)','Uniform','Normal ($n=1000$)')  
   
 pylab.show()  

... creating this plot:



Thursday, February 9, 2012

Additional color names for matplotlib plots

When creating plots with matplotlib, we usually use the "default" color names:
  • b : blue
  • g : green
  • r : red
  • c : cyan
  • m : magenta
  • y : yellow
  • k : black
  • w : white
It turns out that matplotlib accepts not only these default color names, but the full range of html color names! So for example, you can plot some data like this:

 pylab.plot(x, y, 'DarkOrange')  

and get a "dark orange" color.

This is an easy way of specifying colors in addition to the standard ones.

Friday, December 16, 2011

Script illustrating how to do a linear regression and plot the confidence band of the fit

The script below illustrates how to carry out a simple linear regression of data stored in an ASCII file, plot the linear fit and the 2 sigma confidence band.

The script invokes the confband method described here to plot the confidence bands. The confband is assumed to be lying inside the "nemmen" module (not yet publicly available, sorry) but you can place it in any module you want.

I got the test data here to perform the fitting.

After running the script you should get the following plot:
where the best-fit line is displayed in green and the shaded area is in gray.

The script below is also available at Github Gist.

   
 import numpy, pylab, scipy  
 import nemmen  
   
 # Data taken from http://orion.math.iastate.edu/burkardt/data/regression/x01.txt.  
 # I removed the header from the file and left only the data in 'testdata.dat'.  
 xdata,ydata = numpy.loadtxt('testdata.dat',unpack=True,usecols=(1,2))  
 xdata=numpy.log10(xdata) # take the logs  
 ydata=numpy.log10(ydata)  
   
 # Linear fit  
 a, b, r, p, err = scipy.stats.linregress(xdata,ydata)  
   
 # Generates arrays with the fit  
 x=numpy.linspace(xdata.min(),xdata.max(),100)  
 y=a*x+b  
   
 # Calculates the 2 sigma confidence band contours for the fit  
 lcb,ucb,xcb=nemmen.confband(xdata,ydata,a,b,conf=0.95)  
   
 # Plots the fit and data  
 pylab.clf()  
 pylab.plot(xdata,ydata,'o')  
 pylab.plot(x,y)  
   
 # Plots the confidence band as shaded area  
 pylab.fill_between(xcb, lcb, ucb, alpha=0.3, facecolor='gray')  
   
 pylab.show()  
 pylab.draw()  

Wednesday, December 14, 2011

Quick Python reference documents

These quick reference sheets will help you to quickly learn and practice data analysis, plotting and statistics with Python:

Python data analysis reference card (created by Fabricio Ferrari)



Calculating the prediction band of a linear regression model


The method below calculates the prediction band of an arbitrary linear regression model at a given confidence level in Python.

If you use it, let me know if you find any bugs.




  1. def predband(xd,yd,a,b,conf=0.95,x=None):
  2.     """
  3. Calculates the prediction band of the linear regression model at the desired confidence
  4. level.
  5. Clarification of the difference between confidence and prediction bands:
  6. "The 2sigma confidence interval is 95% sure to contain the best-fit regression line.
  7. This is not the same as saying it will contain 95% of the data points. The prediction bands are
  8. further from the best-fit line than the confidence bands, a lot further if you have many data
  9. points. The 95% prediction interval is the area in which you expect 95% of all data points to fall."
  10. (from http://graphpad.com/curvefit/linear_regression.htm)
  11. Arguments:
  12. - conf: desired confidence level, by default 0.95 (2 sigma)
  13. - xd,yd: data arrays
  14. - a,b: linear fit parameters as in y=ax+b
  15. - x: (optional) array with x values to calculate the confidence band. If none is provided, will
  16.  by default generate 100 points in the original x-range of the data.
  17.  
  18. Usage:
  19. >>> lpb,upb,x=nemmen.predband(all.kp,all.lg,a,b,conf=0.95)
  20. calculates the prediction bands for the given input arrays
  21. >>> pylab.fill_between(x, lpb, upb, alpha=0.3, facecolor='gray')
  22. plots a shaded area containing the prediction band  
  23. Returns:
  24. Sequence (lpb,upb,x) with the arrays holding the lower and upper confidence bands
  25. corresponding to the [input] x array.
  26. References:
  27. 1. http://www.JerryDallal.com/LHSP/slr.htm, Introduction to Simple Linear Regression, Gerard
  28. E. Dallal, Ph.D.
  29. Rodrigo Nemmen
  30. v1 Dec. 2011
  31. v2 Jun. 2012: corrected bug in dy.
  32.     """
  33.     alpha=1.-conf   # significance
  34.     n=xd.size   # data sample size
  35.     if x==None: x=numpy.linspace(xd.min(),xd.max(),100)
  36.     # Predicted values (best-fit model)
  37.     y=a*x+b
  38.     # Auxiliary definitions
  39.     sd=scatterfit(xd,yd,a,b)    # Scatter of data about the model
  40.     sxd=numpy.sum((xd-xd.mean())**2)
  41.     sx=(x-xd.mean())**2 # array
  42.     # Quantile of Student's t distribution for p=1-alpha/2
  43.     q=scipy.stats.t.ppf(1.-alpha/2.,n-2)
  44.     # Prediction band
  45.     dy=q*sd*numpy.sqrt( 1.+1./n + sx/sxd )
  46.     upb=y+dy    # Upper prediction band
  47.     lpb=y-dy    # Lower prediction band
  48.     return lpb,upb,x

Calculating and plotting confidence bands for linear regression models

This method calculates the confidence band of an arbitrary linear regression model at a given confidence level in Python. Using this method you can get plots like this:
where the shaded area corresponds to the 2sigma confidence band of the linear fit shown in green. A script illustrating how to use the confband method below is available.

If you use it, let me know if you find any bugs.

Note: the scatterfit method called below is available here.




Sunday, August 21, 2011

References for learning Python

I list in this post the reading material that I used to learn Python. When I started learning it I had considerable previous experience with several programming languages. So the reading list below is targeted at people with some experience. In addition, my goal was to apply Python to science and data analysis.

If you don't have previous programming experience, one suggested starting point is the book Think Python: How to think like a computer scientist. It is available for free.

I learned the basic Python syntax with the official tutorial at http://docs.python.org/.

Fabricio Ferrari's lecture (in portuguese) illustrates what python is capable of for scientific data analysis and plotting. Inspiring.

If you don't learn object-oriented programming (OOP) you will not be able to explore the full potential of Python. I learned OOP from 
Introduction to Programming using Python, Programming Course for Biologists at the Pasteur Institute, Schuerer et al. Very clear exposition.

Tutorial on using Python for astronomical data analysis. How to plot spectra, read and manipulate FITS files and much more. Highly recommended.