BED File Format - Definition and supported options
The BED format consists of one line per feature, each containing 3-12 columns of data, plus optional track definition lines.
Required fields
The first three fields in each feature line are required:
- chrom - name of the chromosome or scaffold. Any valid seq_region_name can be used, and chromosome names can be given with or without the 'chr' prefix.
- chromStart - Start position of the feature in standard chromosomal coordinates (i.e. first base is 0).
- chromEnd - End position of the feature in standard chromosomal coordinates.
chr1 213941196 213942363
chr1 213942363 213943530
chr1 213943530 213944697
chr2 158364697 158365864
chr2 158365864 158367031
chr3 127477031 127478198
chr3 127478198 127479365
chr3 127479365 127480532
chr3 127480532 127481699
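As an illustration of the three required fields, here is a minimal Python sketch that reads one feature line. The names `Bed3` and `parse_bed3` are hypothetical and not part of any WormBase ParaSite tool. The length calculation assumes the usual BED convention that chromEnd is not itself part of the feature, which this page does not state explicitly.

```python
from typing import NamedTuple


class Bed3(NamedTuple):
    """The three required BED fields (hypothetical helper type)."""
    chrom: str
    chrom_start: int  # first base of a chromosome is 0
    chrom_end: int


def parse_bed3(line: str) -> Bed3:
    """Parse the required fields from one whitespace-separated feature line."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"expected at least 3 columns, got {len(fields)}")
    chrom, start, end = fields[0], int(fields[1]), int(fields[2])
    if start < 0 or end < start:
        raise ValueError(f"invalid coordinates: {start}, {end}")
    return Bed3(chrom, start, end)


feature = parse_bed3("chr1 213941196 213942363")
# Assumption: chromEnd is exclusive, so the length is end minus start.
print(feature.chrom, feature.chrom_end - feature.chrom_start)  # chr1 1167
```

Any extra columns on the line are ignored by this sketch, so it can also be used to pull coordinates out of lines that carry optional fields.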
Optional fields
Nine additional fields are optional. Note that columns cannot be empty - lower-numbered fields must always be populated if higher-numbered ones are used.
- name - Label to be displayed under the feature, if turned on in "Configure tracks".
- score - A score between 0 and 1000. See track lines, below, for ways to configure the display style of scored data.
- strand - defined as + (forward) or - (reverse).
- thickStart - coordinate at which to start drawing the feature as a solid rectangle (not currently supported)
- thickEnd - coordinate at which to stop drawing the feature as a solid rectangle (not currently supported)
- itemRgb - an RGB colour value (e.g. 0,0,255). Only used if there is a track line with the value of itemRgb set to "on" (case-insensitive).
- blockCount - the number of sub-elements (e.g. exons) within the feature
- blockSizes - the size of these sub-elements
- blockStarts - the start coordinate of each sub-element
chr7 127471196 127472363 Pos1 0 + 127471196 127472363 255,0,0
chr7 127472363 127473530 Pos2 0 + 127472363 127473530 255,0,0
chr7 127473530 127474697 Pos3 0 + 127473530 127474697 255,0,0
chr7 127474697 127475864 Pos4 0 + 127474697 127475864 255,0,0
chr7 127475864 127477031 Neg1 0 - 127475864 127477031 0,0,255
chr7 127477031 127478198 Neg2 0 - 127477031 127478198 0,0,255
chr7 127478198 127479365 Neg3 0 - 127478198 127479365 0,0,255
chr7 127479365 127480532 Pos5 0 + 127479365 127480532 255,0,0
chr7 127480532 127481699 Neg4 0 - 127480532 127481699 0,0,255
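Because fields are identified purely by position, a reader can map however many columns are present onto the field names in order. The following Python sketch shows one way to do that, with the checks taken from the descriptions above (3-12 columns, score between 0 and 1000, strand + or -). The function name `parse_bed_feature` and the choice of a plain dictionary are illustrative assumptions, not an official parser.

```python
BED_COLUMNS = [
    "chrom", "chromStart", "chromEnd", "name", "score", "strand",
    "thickStart", "thickEnd", "itemRgb", "blockCount", "blockSizes", "blockStarts",
]
INTEGER_COLUMNS = ["chromStart", "chromEnd", "score", "thickStart", "thickEnd", "blockCount"]


def parse_bed_feature(line):
    """Map the columns of one feature line onto BED field names, by position."""
    values = line.split()
    if not 3 <= len(values) <= len(BED_COLUMNS):
        raise ValueError(f"expected 3-12 columns, got {len(values)}")
    # zip stops at the shorter sequence, so only the populated fields appear.
    record = dict(zip(BED_COLUMNS, values))
    for column in INTEGER_COLUMNS:
        if column in record:
            record[column] = int(record[column])
    if "score" in record and not 0 <= record["score"] <= 1000:
        raise ValueError(f"score out of range: {record['score']}")
    # Assumption: only the two strand values listed on this page are accepted.
    if "strand" in record and record["strand"] not in ("+", "-"):
        raise ValueError(f"unexpected strand: {record['strand']}")
    if "itemRgb" in record:
        record["itemRgb"] = tuple(int(part) for part in record["itemRgb"].split(","))
    return record


record = parse_bed_feature("chr7 127471196 127472363 Pos1 0 + 127471196 127472363 255,0,0")
print(record["name"], record["itemRgb"])  # Pos1 (255, 0, 0)
```

A line with only six columns yields a dictionary with six keys, which reflects the rule that lower-numbered fields must be populated before higher-numbered ones can be used.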
Track lines
Track definition lines can be used to configure the display further, e.g. by grouping features into separate tracks. Track lines should be placed at the beginning of the list of features they are to affect.
The track line consists of the word 'track' followed by space-separated key=value pairs - see the example below. Valid parameters used by WormBase ParaSite are:
- name - unique name to identify this track when parsing the file
- description - Label to be displayed under the track in Region in Detail
- priority - integer defining the order in which to display tracks, if multiple tracks are defined.
- useScore - a value from 1 to 4, which determines how scored data will be displayed. Additional parameters may be needed, as described below.
  - 1 = tiling array
  - 2 = colour gradient - defaults to Yellow-Green-Blue, with 20 colour grades. Optionally you can specify the colours for the gradient (cgColour1, cgColour2, cgColour3) as either RGB, hex or X11 colour names, and the number of colour grades (cgGrades).
  - 3 = histogram
  - 4 = wiggle plot
- itemRgb - if set to 'on' (case-insensitive), the individual RGB values given in the itemRgb column of each feature will be used.
track name="ItemRGBDemo" description="Item RGB demonstration" itemRgb="On"
chr7 127471196 127472363 Pos1 0 + 127471196 127472363 255,0,0
chr7 127472363 127473530 Pos2 0 + 127472363 127473530 255,0,0
chr7 127473530 127474697 Pos3 0 + 127473530 127474697 255,0,0
chr7 127474697 127475864 Pos4 0 + 127474697 127475864 255,0,0
chr7 127475864 127477031 Neg1 0 - 127475864 127477031 0,0,255
chr7 127477031 127478198 Neg2 0 - 127477031 127478198 0,0,255
chr7 127478198 127479365 Neg3 0 - 127478198 127479365 0,0,255
chr7 127479365 127480532 Pos5 0 + 127479365 127480532 255,0,0
chr7 127480532 127481699 Neg4 0 - 127480532 127481699 0,0,255
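A track line such as the one above can be split into its key=value pairs with Python's standard `shlex` module, which keeps quoted values containing spaces together. This is only a sketch of one possible approach. The helper names `parse_track_line` and `uses_item_rgb` are hypothetical, and the sketch does not check keys against the list of supported parameters.

```python
import shlex


def parse_track_line(line):
    """Split a 'track' line into a dictionary of its key=value settings."""
    tokens = shlex.split(line)  # honours double-quoted values with spaces
    if not tokens or tokens[0] != "track":
        raise ValueError("not a track line")
    settings = {}
    for token in tokens[1:]:
        key, separator, value = token.partition("=")
        if not separator:
            raise ValueError(f"expected key=value, got {token!r}")
        settings[key] = value
    return settings


def uses_item_rgb(settings):
    """itemRgb is compared case-insensitively, as described above."""
    return settings.get("itemRgb", "").lower() == "on"


settings = parse_track_line(
    'track name="ItemRGBDemo" description="Item RGB demonstration" itemRgb="On"'
)
print(settings["description"])  # Item RGB demonstration
print(uses_item_rgb(settings))  # True
```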
BedGraph format
BedGraph is a suitable format for moderate amounts of scored data. It is based on the BED format (see above) with the following differences:
- The score is placed in column 4, not column 5
- Track lines are compulsory, and must include type=bedGraph. Currently the only optional parameters supported by WormBase ParaSite are:
  - name - see above
  - description - see above
  - priority - see above
  - graphType - either 'bar' or 'points'.
track type=bedGraph name="BedGraph Format" description="BedGraph format" priority=20
chr19 59302000 59302300 -1.0
chr19 59302300 59302600 -0.75
chr19 59302600 59302900 -0.50
chr19 59302900 59303200 -0.25
chr19 59303200 59303500 0.0
chr19 59303500 59303800 0.25
chr19 59303800 59304100 0.50
chr19 59304100 59304400 0.75
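The two differences from BED (score in column 4, compulsory track line with type=bedGraph) can be sketched in Python as follows. The function name `parse_bedgraph` is hypothetical, and the sketch assumes a single track per input with the track line first. It is not a description of how WormBase ParaSite itself reads the file.

```python
import shlex


def parse_bedgraph(text):
    """Return (chrom, start, end, score) tuples from BedGraph-formatted text."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty input")
    header = shlex.split(lines[0])
    # The track line is compulsory and must declare type=bedGraph.
    if header[0] != "track" or "type=bedGraph" not in header:
        raise ValueError("missing 'track type=bedGraph' line")
    rows = []
    for line in lines[1:]:
        chrom, start, end, score = line.split()[:4]
        # Unlike BED, the score sits in column 4 and may be fractional or negative.
        rows.append((chrom, int(start), int(end), float(score)))
    return rows


sample = """track type=bedGraph name="BedGraph Format" description="BedGraph format" priority=20
chr19 59302000 59302300 -1.0
chr19 59302300 59302600 -0.75
"""
for row in parse_bedgraph(sample):
    print(row)
# ('chr19', 59302000, 59302300, -1.0)
# ('chr19', 59302300, 59302600, -0.75)
```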