
Basic Linux Unix Commands

Vanderll edited this page Oct 16, 2012 · 6 revisions

Basic Linux/Unix Commands

  • Made for people who don't have a clue when it comes to computers (e.g. Lauren V)

Viewing Files

  • View the first 15 lines of your file (note: if you leave out -n 15, the default is the first 10 lines)
  head -n 15 filename.psl
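To see the default and the explicit line count side by side, here is a small sketch on a throwaway file. The file name sample.txt and its contents are made up for illustration; -n 15 is the POSIX spelling of the older -15 shorthand.

```shell
# Build a 20-line sample file in a temporary directory (hypothetical data)
tmpdir=$(mktemp -d)
seq 1 20 > "$tmpdir/sample.txt"

# No option: head shows the first 10 lines
default_lines=$(head "$tmpdir/sample.txt" | wc -l | tr -d ' ')

# -n 15: head shows the first 15 lines
first15_lines=$(head -n 15 "$tmpdir/sample.txt" | wc -l | tr -d ' ')

echo "default: $default_lines lines, with -n 15: $first15_lines lines"
```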

Extracting Specific Info from Complex Files Using awk

  • Motivating example: the rat genome was updated and BLAT was used to align the probes to the new genome. We need to extract each probe's identifier and location so that a mask can be applied to the probes
Our data looks like:
We use awk to create a bed file containing the chromosome, start position, stop position, probeset identifiers (probe ID, then x and y coordinates on the array), and strand
   awk 'NR>5{split($14,a,";"); split(a[1],b,":"); split(a[2],c,":"); print $10"\t"$12"\t"$13"\t"b[3]"\t"c[1]"\t"c[2]"\t"$9}' output4.psl > output4.bed 
Going through the command step by step:
  1. awk: run the awk program
  2. NR>5: NR is the number of the current record (line), so NR>5 runs the code in braces only on lines after the fifth (i.e. our headers from the psl file take up the first 5 rows, so read data after that)
  3. split($14, a, ";"): make an array called a by splitting column 14 at the semicolon. This leaves us with an array of 2 elements.
  4. split(a[1], b, ":"): make an array called b by splitting element 1 of array a at each colon. This leaves us with 3 elements. The first 2 are the same on every row (they identify the array) and the last is the probe ID
  5. split(a[2], c, ":"): make an array called c by splitting element 2 of array a at the colon. This leaves us with 2 elements, one for the x and one for the y coordinate of the probe on the array
  6. print $10"\t"...rest of code: print the columns and extracted pieces we want, separated by tabs. The > output4.bed at the end of the command writes everything into 1 file named output4.bed
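The steps above can be traced on a single made-up row. This is only a sketch: the file toy.psl, its five placeholder header lines, and every value in the data row are invented, and the layout of column 14 (arrayName:arrayVersion:probeID;x:y) is assumed from the step-by-step description rather than taken from the real data.

```shell
tmpdir=$(mktemp -d)

# Five placeholder header lines, then one invented data row.
# Column 9 = strand, 10 = chromosome, 12 = start, 13 = stop,
# 14 = "arrayName:arrayVersion:probeID;x:y" (assumed layout).
{
  printf 'header1\nheader2\nheader3\nheader4\nheader5\n'
  printf '25\t0\t0\t0\t0\t0\t0\t0\t+\tchr1\t1000\t100\t125\tarrayA:v1:P001;12:34\n'
} > "$tmpdir/toy.psl"

# Same awk program as above, pointed at the toy file
awk 'NR>5{split($14,a,";"); split(a[1],b,":"); split(a[2],c,":"); print $10"\t"$12"\t"$13"\t"b[3]"\t"c[1]"\t"c[2]"\t"$9}' "$tmpdir/toy.psl" > "$tmpdir/toy.bed"

result=$(cat "$tmpdir/toy.bed")
echo "$result"
# One tab-separated line: chr1 100 125 P001 12 34 +
```

The five header lines produce no output because NR>5 is false for them; only the sixth line reaches the split and print steps.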

Determine How Many Rows Your File Has

  • If it takes a long time to load data into a program and you just want to check that you have the correct number of rows, use:
  wc -l filename.bed
  • wc stands for word count
  • -l stands for lines (without it, wc also reports words and bytes)
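As a quick check of what wc -l reports, here is a sketch on a three-row file. The file name toy.bed and its contents are invented for illustration.

```shell
tmpdir=$(mktemp -d)

# Three made-up rows, each ending in a newline
printf 'chr1\t100\t125\nchr2\t200\t225\nchr3\t300\t325\n' > "$tmpdir/toy.bed"

# Given a file name, wc prints the count followed by that name
wc -l "$tmpdir/toy.bed"

# Reading from standard input gives the count alone, which is handy in scripts
row_count=$(wc -l < "$tmpdir/toy.bed" | tr -d ' ')
echo "$row_count"
```

wc -l counts newline characters, so a last row that does not end in a newline is not counted.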

Making Your Files Readable, Writable and Executable By Another User

  chmod a+rwx filename
  • chmod stands for change mode
  • a stands for all users (owner, group and everyone else), and + means add the permissions that follow
  • r read
  • w write
  • x execute
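To confirm that the change took effect, the permissions can be listed with ls -l. This is a sketch on a throwaway file; the name shared.txt is made up.

```shell
tmpdir=$(mktemp -d)
touch "$tmpdir/shared.txt"

# Give read, write and execute permission to all users
chmod a+rwx "$tmpdir/shared.txt"

# The first 10 characters of the listing are the file type and permission bits
ls -l "$tmpdir/shared.txt"
perms=$(ls -l "$tmpdir/shared.txt" | cut -c1-10)
echo "$perms"
```

After the first character (the file type), the permission string reads as three groups of rwx: owner, group, then everyone else. If only members of your group need access, chmod g+rw filename is a narrower alternative.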
