Bioinformatics Tools

Tuesday, December 27, 2011

Circular dichroism code to help in data analysis

I was looking for code to rearrange the thermal melt data I get from CD (circular dichroism). I could not find any code to convert .jsw files to CSV in batch, and JASCO’s Spectrum Analysis software does not help with that either; let me know if there is a batch conversion option for .jsw files. For now you have to convert each .jsw file to CSV individually and put the CSV files in one folder. What I could get is a script that, once the .jsw files are converted, collects the data from all the CSV files into a single file, which makes the analysis easier. The code given below copies the data from all the files into one file, from 350 nm to 200 nm, with each file name used as the column header for mdeg and tension (HV).

Steps:

1. Install Python if you do not already have it (http://www.python.org/getit/).

2. Copy all the CSV files, with their names, to one folder.

3. Write the names of the CSV files in one text file and save it as file_name.txt in the same folder as your data and code.

    a. You can do this from the MS-DOS prompt or the Windows command line. Navigate to the directory whose contents you want to list (if you're new to the command line, familiarize yourself with the cd and dir commands), then type this command: dir /b > file_name.txt

    b. Open the newly created file_name.txt in the same folder and check the file names. If file_name.txt itself or any other non-CSV file is listed, remove it so that only the CSV file names remain in the text file.

4. Copy the code given below into Notepad and save it as a .py file (it is Python code) in the same folder.

5. Right-click the Python file, open it in Python IDLE, and run it (press F5).

6. You will get a result file named final_file.txt. It is a tab-separated text file with your mdeg and HV data sorted from 350 nm to 200 nm; open it with Excel. You can change the code to suit your needs. For example, if your data runs from 260 nm to 200 nm, change x=range(151) to x=range(61) and outfile.write(str(350-j)) to outfile.write(str(260-j)). Note that the script reads the data rows in order and does not look up the wavelength in each file, so the first data row of every file must be the starting wavelength.

Hope that helps. Thank Rhishikesh Bargaje (he wrote the code for me) if it works, and write to me if you run into a problem; I can try to help.
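
If you would rather not use the command line for step 3, the file list can also be written from Python. This is only a minimal sketch: the function name write_file_list is made up for illustration, and it assumes every file ending in .csv in the folder is a data file you want merged.

```python
import glob
import os


def write_file_list(folder=".", list_name="file_name.txt"):
    """Write the names of all .csv files in `folder` to `list_name`, one per line."""
    # Keep only the bare file names and sort them so the column order is predictable.
    # Note: the sort is alphabetical, so "10.csv" comes before "2.csv".
    names = sorted(os.path.basename(p) for p in glob.glob(os.path.join(folder, "*.csv")))
    # No newline after the last name, so the list contains no empty entry.
    with open(os.path.join(folder, list_name), "w") as handle:
        handle.write("\n".join(names))
    return names
```

Calling write_file_list() with no arguments from a script saved in the data folder writes file_name.txt next to the CSV files, and only CSV names end up in it, so there is nothing to remove by hand.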



Code:



# Read the list of CSV file names, skipping blank lines and the list file itself
infile = open('file_name.txt','r')
s = [name.strip() for name in infile.read().split('\n')]
s = [name for name in s if name and name != 'file_name.txt']
infile.close()

# Header row: one mdeg column and one HV column per file
outfile = open('final_file.txt','w')
outfile.write('Wavelength')
for k in s:
    label = k.replace('.csv','').replace(' ','_')
    outfile.write('\t' + label + '_mdeg' + '\t' + label + '_HV')
outfile.write('\n')

# Read the XYDATA block of each file once, as a list of data rows
rows = {}
for i in s:
    infile = open(i,'r')
    t = infile.read().split('XYDATA\n')
    infile.close()
    rows[i] = t[1].split('\n\n')[0].split('\n')

# One output row per wavelength: 151 points, 350 nm down to 200 nm
x = range(151)
for j in x:
    outfile.write(str(350-j))
    for i in s:
        fields = rows[i][j].split(',')
        outfile.write('\t' + fields[1] + '\t' + fields[2])
    outfile.write('\n')
outfile.close()


##end of the code##
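
If you change the wavelength range often, it may be easier to keep the range and the file parsing in two small functions. The sketch below is not part of Rhishikesh's script; the names wavelength_axis and read_xydata are made up for illustration. It assumes the same file layout the script relies on: a line reading XYDATA, then one "wavelength,mdeg,HV" row per line, ending at the first blank line.

```python
def wavelength_axis(start=350, stop=200):
    """Return the wavelengths from `start` down to `stop` in 1 nm steps."""
    return list(range(start, stop - 1, -1))


def read_xydata(text, n_points=151):
    """Return (mdeg, hv) as lists of strings from the text of one exported CSV."""
    # Normalise Windows line endings, then keep what lies between
    # the XYDATA line and the first blank line (assumed layout).
    text = text.replace('\r\n', '\n')
    block = text.split('XYDATA\n')[1].split('\n\n')[0]
    rows = [line.split(',') for line in block.split('\n') if line.strip()]
    rows = rows[:n_points]
    mdeg = [row[1] for row in rows]  # second column: CD signal in mdeg
    hv = [row[2] for row in rows]    # third column: tension (HV)
    return mdeg, hv
```

For a 260 nm to 200 nm scan, wavelength_axis(260, 200) gives the 61 wavelengths and read_xydata(text, 61) the matching values, so the range only has to be changed in one place.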

Alternatively, if you are acquainted with R (download it from http://cran.r-project.org/ if you haven't), you can use the following script to get the same result, with the columns labelled by temperature for a thermal melt from 10 degrees to 70 degrees. Edit the code to suit your use if needed. Note that this R code does not need the file_name.txt list, and it may not work properly if there are other files in the data folder. Thank Shrikant if you find it useful.

Code:

 ##Start of the code##

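# Collect the CSV files in the working directory ("$" anchors the extension)
CSV_Files=list.files(path=".",pattern="\\.csv$",full.names=FALSE);
# Drop the output of a previous run so it is not read back in as data
CSV_Files=setdiff(CSV_Files,"Result.csv");
# First column: wavelength, 350 nm down to 200 nm (151 points)
ResultantMatrix=matrix(nrow=151);
ResultantMatrix[,1]=c(350:200);
for(i in seq_along(CSV_Files))
{
    Current_File=read.table(CSV_Files[[i]],header=FALSE,blank.lines.skip=FALSE);
    tempM=matrix(nrow=151,ncol=2);
    k=1;
    # Rows 21 to 171 are assumed to hold the 151 data lines of each file
    for(j in 21:171)
    {
        temp=strsplit(as.character(Current_File[j,1]),split=",");
        tempM[k,1]=temp[[1]][2];   # mdeg
        tempM[k,2]=temp[[1]][3];   # HV
        k=k+1;
    }
    # Column label: the number at the end of the file name plus 9,
    # so file number 1 is labelled 10 degrees
    t=as.numeric(gsub("^.*?(\\d+)\\.csv$","\\1",CSV_Files[[i]],perl=TRUE))+9;
    colnames(tempM)=c(t,t);
    ResultantMatrix=cbind(ResultantMatrix,tempM);
}

write.csv(ResultantMatrix,file="Result.csv");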

##End of the code##
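
The R script labels each pair of columns with a temperature worked out from the number in the file name plus 9. For anyone staying with Python, the same idea might look like the sketch below. The function name temperature_from_name and the example file names are made up, and it assumes the files are numbered in order of the melt (file 1 at 10 degrees, one degree per file).

```python
import re


def temperature_from_name(file_name, offset=9):
    """Return the file number at the end of `file_name` plus `offset`, or None."""
    # Look for digits immediately before the .csv extension.
    match = re.search(r"(\d+)\.csv$", file_name)
    if match is None:
        return None  # no number in the name, so no temperature label
    return int(match.group(1)) + offset
```

With these assumptions a file named melt_01.csv is labelled 10 and melt_61.csv is labelled 70; change offset if your melt starts at a different temperature.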
