Three ways you can
help :
|
- |
as a registered
annotator. If you want full access, you have to register by sending an E-mail to
Diego Martinez
stating your name, affiliation, chosen username and password,
and who introduced you to Chlamy genome annotation. Notify
Olivier Vallon
so that your name is added to the mailing list. |
- |
as an occasional contributor, in which case you need to find in the list of annotators (« tasklist.xls »)
someone who is ready to enter your data under his/her name.
Defaults :
O. Vallon or S.
Merchant |
- |
as a seed.
If you know people not necessarily working with Chlamy but
with the skill and expertise needed in a field that is not
properly covered yet, ask them to join for free! |
In any event, you need to know
what is expected in terms of depth of annotation. Read these
guidelines carefully, but then find the procedure that suits you best. There are
as many ways to annotate as there are annotators. Feel free
to emphasize the aspects you deem more important; this is fine as long
as the annotation you enter is both accurate
and useful to others not familiar with this particular type of gene. It need not be as thorough as suggested below: the community welcomes all levels of participation.
You can use the « annotation
spreadsheet.xls » to collect and organize your
data before or as you are entering it.
You can only annotate the
filtered gene models of version 2.0.
It is not possible for the moment to enter or modify gene models,
so if no model covers your gene, add annotation to the nearby gene
model as « next to gene X (coordinates) coding
for … ». Your annotation will be carried over
into version 3.0.
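As an illustration only, the placeholder wording above can be produced consistently with a small helper. This is a minimal sketch: the function name, the `scaffold:start-end` coordinate format and the example values are assumptions, not something the annotation site prescribes.

```python
def neighbor_note(gene: str, scaffold: str, start: int, end: int, product: str) -> str:
    """Build the 'next to gene X (coordinates) coding for ...' wording.

    The coordinate format (scaffold:start-end) is an assumed convention.
    """
    # Report coordinates in ascending order whatever the strand.
    if start > end:
        start, end = end, start
    return f"next to gene {gene} ({scaffold}:{start}-{end}) coding for {product}"


# Hypothetical gene name, scaffold and product, for illustration only.
print(neighbor_note("XYZ1", "scaffold_12", 2000, 1000, "a hypothetical kinase"))
```

Paste the resulting text into the description of the nearby gene model.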
The
annotation process:
Start with genes or gene families
you are most familiar with, then
work your way into areas you know less about. If your annotation
is vague, chances are it will be completed by another user.
Do not hesitate to add your annotation to genes already annotated by others. If you think the annotation out
there is wrong or misleading, contact its author by email (list
of annotators is at this web site : « TaskList.xls »)
and discuss changing it.
Start with the protein page
for your gene model and check the validity of the automatic annotation provided.
Check the gene structure on the browser by comparing it with EST data and
alignments. By clicking « annotate this protein »,
you access the transcript page,
which displays the best Smith-Waterman hits (you can change
the filter), INTERPRO domains, GO codes, E.C. number, etc.
You can add to any field, or edit it if you entered it
yourself. Both your annotation and that of previous users will
be displayed. Fields you can fill :
Gene name :
only if you know enough about that protein to give the gene
a meaningful name. Consult the « guidelines »
on the Chlamy Center website : the favorite is a 3-letter
radical followed by a numeral, but other options are possible.
Check that you are not giving a radical already used with another
meaning. This can be done through the advanced search page by
entering for example ABC* to search for the radical ABC.
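If you keep a local list of gene names already in use (for example, copied from the search results), the name check can be sketched as follows. This is a hedged illustration, not the site's own validation: the uppercase-only pattern, the function names and the example names are assumptions, and the wildcard match simply mimics an « ABC* » search.

```python
import re
from fnmatch import fnmatchcase

# Assumed reading of the preferred form: three uppercase letters, then a numeral.
NAME_PATTERN = re.compile(r"([A-Z]{3})(\d+)")


def split_gene_name(name):
    """Return (radical, number) if the name has the preferred form, else None."""
    match = NAME_PATTERN.fullmatch(name)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def names_sharing_radical(radical, existing_names):
    """List existing names matched by the wildcard search '<radical>*'."""
    pattern = f"{radical}*"
    return [name for name in existing_names if fnmatchcase(name, pattern)]


# Hypothetical names, for illustration only.
existing = ["ABC1", "ABCD2", "XYZ3"]
print(split_gene_name("ABC12"))
print(names_sharing_radical("ABC", existing))
```

Like the « ABC* » search, the wildcard also returns names with a longer radical that begins with the same letters (here the hypothetical ABCD2), so inspect each hit before concluding that the radical is taken.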
Description :
the most important field. Your entry will be searchable together
with gene name by users at
the « advanced search » page, so
be as thorough as possible. Suggested order : protein
name/function (often you can just
copy/paste from a Swissprot entry and this is the preferred
description line); alternative names ;
alternative gene names (in parentheses,
including gene names in other organisms) ; identical
to XXXX, if already cloned in
Chlamy (or « almost identical » if there
are differences) ; subcellular localization (experimental,
by sequence similarity …) ; expression
pattern, etc. Add as much literature
information as needed, in the form of a PMID number (not the whole
citation). You can also add negative information,
like « not a protein kinase », if the
automatic annotation that guided you there is wrong. « Probably
not a gene » is also useful. There will be many « unknown
function », or « conserved in eukaryotes »,
etc. In general, « unknown » means
you believe there is a gene but know nothing of its function, while
« hypothetical » means no hit and no EST data.
In general, do not leave a protein model you spent any time
on unannotated.
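To keep description lines consistent, the suggested field order can be assembled with a small helper before pasting the result into the Description field. This is only a sketch: the class, the field names, the « ; » separator and the « PMID: » prefix are assumptions, and the example values (including the PMID) are placeholders, not real data.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Description:
    """Parts of a description line, in the suggested order."""
    protein_name: str
    alternative_names: List[str] = field(default_factory=list)
    alternative_gene_names: List[str] = field(default_factory=list)
    identical_to: Optional[str] = None      # gene already cloned in Chlamy
    almost_identical: bool = False          # True if there are differences
    localization: Optional[str] = None
    expression: Optional[str] = None
    pmids: List[int] = field(default_factory=list)

    def render(self) -> str:
        parts = [self.protein_name]
        parts.extend(self.alternative_names)
        if self.alternative_gene_names:
            # Alternative gene names go in parentheses.
            parts.append("(" + ", ".join(self.alternative_gene_names) + ")")
        if self.identical_to:
            prefix = "almost identical to" if self.almost_identical else "identical to"
            parts.append(f"{prefix} {self.identical_to}")
        if self.localization:
            parts.append(self.localization)
        if self.expression:
            parts.append(self.expression)
        # Literature is cited by PMID number only, not as a whole citation.
        parts.extend(f"PMID: {pmid}" for pmid in self.pmids)
        return "; ".join(parts)


# Placeholder values, for illustration only.
example = Description(
    protein_name="example protein",
    alternative_gene_names=["XYZ1"],
    localization="chloroplast (by sequence similarity)",
    pmids=[11111111],
)
print(example.render())
```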
Model notes :
will not be searched, but is useful for any detailed analysis.
Start by choosing from the pull-down menu (for example « no
issue », or « sequence gap impacts model »).
Indicate if gene structure needs to be modified, e.g. « 6th
intron not supported by EST data », or whether you
suspect misassembly, e.g. « C-terminus probably represented
by C_xxxxxx ». This is where you enter splicing variants
/ alternative transcription starts (if biologically significant,
enter also in the description), or clustering with genes of
related function, or overlapping with neighboring genes etc...
EST evidence :
choose yes or no. Watch for genomic contaminants, ESTs that
match other places in the genome etc... In the comment line,
add info for the EST contig(s) best describing the transcript
(copy from the feature line).
Gene Ontology :
if you feel this is important, either add from the automatic
ontology field (domain, GO, EC code) or fill in the User-Assigned
Ontology field. If needed, request creation of a new GO code.
Annotation strategies:
Various ways you can
choose the genes you annotate:
|
- |
with a keyword or
gene name : use « advanced search » on
the annotations, or « simple search » on
the Smith-Waterman (Blast) hits, or use the GO or KEGG search
pages to retrieve gene models mapped to that function |
- |
from a domain : « simple
search » INTERPRO hits for an IP number (but
significance is not assessed) |
- |
from a sequence :
TblastN on genome (for additional info, you can use bonus
scaffolds or non-assembled reads, but there are no gene models there) |
- |
by just hopping
to the next unannotated gene on the scaffold … |