Introduction to GenBank
Here are some sample GenBank files:
NM_003938
XM_045915
NM_000229
BC000578
Study the organization of these files. Notice indentation, use of slashes,
capitalization, spacing, etc.
You can download records directly from GenBank. Try doing a nucleotide search
for melanoma. The search returns a list of records containing the term
melanoma. Select one of these records. Once it is displayed, click the "Text"
button near the top of the page. Save the file by selecting File-->Save As...
and give it an appropriate name.
Here's a little script that lists the files in a directory and then prints
the entire contents of the file the user names:
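#!/usr/bin/perl -w
# List the files in the GENBANK directory, then print the one the user picks.
print "Available Files:\n";
@files = ();
$folder = 'GENBANK';
unless(opendir(FOLDER, $folder)){
    print "Cannot open folder\n";
    exit;
}
@files = readdir(FOLDER);
closedir(FOLDER);
foreach $f (@files){
    next if $f =~ /^\.\.?$/;    # skip the . and .. entries
    print "$f\n";
}
print "Enter name of file: ";
$filename = <STDIN>;            # read the user's choice from the keyboard
chomp $filename;
$filename = 'GENBANK/'.$filename;
unless(open(FH, $filename)){
    print "Cannot open $filename\n";
    exit;
}
@data = <FH>;                   # read every line of the file into an array
close(FH);
print @data;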
In order for this script to work you must have all your GenBank files stored
in a directory called GENBANK and the script must exist in the same
directory as the GENBANK directory.
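The directory-listing step can also be pulled out into a small subroutine that returns only plain files, which leaves out the . and .. entries that readdir reports. What follows is only a sketch under stated assumptions: the name list_files is hypothetical, and the demonstration builds a throwaway directory with File::Temp and placeholder file names so that it runs even without a GENBANK folder.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper (not part of the script above): return the sorted
# names of the plain files found in $folder.
sub list_files {
    my ($folder) = @_;
    opendir(my $dh, $folder) or die "Cannot open folder $folder: $!\n";
    # -f is true only for plain files, so ., .. and subfolders are dropped
    my @names = grep { -f "$folder/$_" } readdir($dh);
    closedir($dh);
    my @sorted = sort @names;
    return @sorted;
}

# Demonstration on a temporary directory; the file names are placeholders.
my $demo = tempdir(CLEANUP => 1);
foreach my $name ('b.gb', 'a.gb') {
    open(my $fh, '>', "$demo/$name") or die "Cannot create $name: $!\n";
    print $fh "LOCUS placeholder\n";
    close($fh);
}
mkdir "$demo/subdir" or die "Cannot create subdir: $!\n";

print "$_\n" foreach list_files($demo);
```

To use it with your own files, call list_files('GENBANK') in place of the demonstration directory.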
Here's a slightly modified version of this script which prints the ACCESSION
number contained in the file:
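#!/usr/bin/perl -w
# List the files in the GENBANK directory, then print the ACCESSION number
# of the one the user picks.
print "Available Files:\n";
@files = ();
$folder = 'GENBANK';
unless(opendir(FOLDER, $folder)){
    print "Cannot open folder\n";
    exit;
}
@files = readdir(FOLDER);
closedir(FOLDER);
foreach $f (@files){
    next if $f =~ /^\.\.?$/;    # skip the . and .. entries
    print "$f\n";
}
print "Enter name of file: ";
$filename = <STDIN>;            # read the user's choice from the keyboard
chomp $filename;
$filename = 'GENBANK/'.$filename;
unless(open(FH, $filename)){
    print "Cannot open $filename\n";
    exit;
}
@data = <FH>;                   # read every line of the file into an array
close(FH);
foreach my $line (@data){
    if($line =~ /^ACCESSION/ ){
        $line =~ s/^ACCESSION\s*//;    # strip the keyword and the spaces after it
        chomp ($line);
        print "$line\n";
        exit;                          # stop at the first ACCESSION line
    }
}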
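The match-and-strip idea used for ACCESSION works for any keyword that starts a line. The sketch below wraps it in a subroutine; the name first_field is hypothetical, and the sample lines are made-up placeholders rather than a real GenBank record. It is one possible way to organize the code, not the only one.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the scripts above): return the text that
# follows the first line beginning with $keyword, or nothing if no line matches.
sub first_field {
    my ($keyword, @lines) = @_;
    foreach my $line (@lines) {
        # \Q...\E treats the keyword literally; (.*) stops before the newline
        if ($line =~ /^\Q$keyword\E\s+(.*)/) {
            my $value = $1;
            $value =~ s/\s+$//;    # trim trailing whitespace
            return $value;
        }
    }
    return;
}

# Made-up placeholder lines standing in for the top of a GenBank file.
my @sample = (
    "LOCUS       XX000000     100 bp    mRNA\n",
    "DEFINITION  Placeholder record for illustration.\n",
    "ACCESSION   XX000000\n",
);

print first_field('ACCESSION', @sample), "\n";
print first_field('DEFINITION', @sample), "\n";
```

In a real script you would pass the @data array read from the selected file in place of @sample.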
ASSIGNMENT:
First of all you will need to download another six files from GenBank so
that you have a directory containing ten GenBank files. Next you will write
a script which lists the files in your GENBANK directory, allows the user to
select a file from the list, and then displays the accession number, base
count, and source for the selected file.