FastHiC: A Fast Algorithm to Detect Long-Range Chromosomal Interactions from Hi-C Data

What You Need


Installation

After downloading the FastHiC_All.zip into a chosen local folder "local_path",

How to Run

To calculate the expected frequency, we recommend our utility software, which is based on Fit-Hi-C. The utility, modified_fithic_1.0.1.tgz, is downloadable here and is a modified version of Fit-Hi-C that outputs an extra column for "expected frequency" in addition to all the standard Fit-Hi-C results. Its command interface is exactly the same as that of Fit-Hi-C. For more details, please refer to Fit-Hi-C at https://noble.gs.washington.edu/proj/fit-hi-c/.

To conduct peak calling, users need to prepare a Hi-C data file for HiC_HMRF_Bayes_Files to load. This is a text file with 4 columns: fragment 1 number, fragment 2 number, observed frequency and expected frequency.
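As an illustration of the 4-column layout described above, the following Python sketch writes such a file from in-memory records. The function name write_fasthic_input is hypothetical and not part of FastHiC. The single-space separator and the 9-decimal scientific notation for the expected frequency are assumptions chosen to mirror the example file simHiC_67frags.txt, so check them against your own pipeline.

```python
def write_fasthic_input(rows, path, sep=" "):
    """Write (frag1, frag2, observed, expected) records as a 4-column text file.

    Hypothetical helper, not part of FastHiC. The header names follow the
    example file; the separator and number formatting are assumptions.
    """
    with open(path, "w") as handle:
        # Header line as it appears in the example input file.
        handle.write(sep.join(["frag1", "frag2", "Oij", "Eij"]) + "\n")
        for frag1, frag2, observed, expected in rows:
            fields = [
                str(int(frag1)),
                str(int(frag2)),
                str(int(observed)),
                # Scientific notation with 9 decimals, e.g. 9.334681950e+00.
                format(float(expected), ".9e"),
            ]
            handle.write(sep.join(fields) + "\n")


# Example call (the output file name is arbitrary):
# write_fasthic_input([(285673, 285674, 10, 9.33468195)], "my_hic_input.txt")
```

The expected frequencies themselves would still come from the modified Fit-Hi-C utility; this sketch only handles the file layout.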

The 3 required command parameters are:

  • --interaction, the Hi-C input data file. This is a text file with 4 columns: fragment 1 number, fragment 2 number, observed frequency and expected frequency. The example file is simHiC_67frags.txt.
  • --outestimator, the Hi-C output parameter file, which lists the estimated values of the parameters in the HMRF peak calling model. The example file is simHiC_67frags_Estimated_Parameters_SFA.txt.
  • --outprob, the Hi-C output peak probability file. This is a text file with 5 columns: fragment 1 number, fragment 2 number, observed frequency, expected frequency and peak probability. The example file is simHiC_67frags_Results_SFA.txt.
The 3 optional command parameters are:
  • --iter, the number of iterations for HMRFHiCFast to run in the simulated field algorithm. By default, iter is 300.
  • --neighbor or -n, the range of the neighborhood. By default, neighbor is 1.
  • --help or -h, print the help message.

To run peak calling based on the HMRF Bayesian method, use
java -jar FastHiC.jar --interaction simHiC_67frags.txt --outprob simHiC_67frags_Results_SFA.txt --outestimator simHiC_67frags_Estimated_Parameters_SFA.txt --iter 300
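
For scripted runs over many input files, the command above can be assembled programmatically. The sketch below is one possible way to do this in Python. The helper name build_fasthic_command is hypothetical, and only the options documented above are used. It builds the argument list without launching Java, so it can be inspected first.

```python
import shlex


def build_fasthic_command(interaction, outprob, outestimator,
                          iterations=None, neighbor=None, jar="FastHiC.jar"):
    """Assemble the FastHiC command line as an argument list.

    Hypothetical helper. It uses only the documented options: --interaction,
    --outprob, --outestimator, --iter and --neighbor.
    """
    cmd = [
        "java", "-jar", jar,
        "--interaction", interaction,
        "--outprob", outprob,
        "--outestimator", outestimator,
    ]
    # Optional parameters are appended only when given, so that FastHiC
    # falls back to its own defaults otherwise.
    if iterations is not None:
        cmd += ["--iter", str(iterations)]
    if neighbor is not None:
        cmd += ["--neighbor", str(neighbor)]
    return cmd


cmd = build_fasthic_command(
    "simHiC_67frags.txt",
    "simHiC_67frags_Results_SFA.txt",
    "simHiC_67frags_Estimated_Parameters_SFA.txt",
    iterations=300,
)
print(shlex.join(cmd))  # shows the command as a single shell-style string
```

To actually launch the run, the list can be passed to subprocess.run(cmd, check=True), assuming java is on the PATH and FastHiC.jar is in the working directory.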

File formats

  • The input Hi-C data file is a text file with 4 columns: fragment 1 number, fragment 2 number, observed frequency and expected frequency. For example, the first several lines of "simHiC_67frags.txt" are
    frag1 frag2 Oij Eij
    285673 285674 10 9.334681950e+00
    285673 285675 18 5.941096709e+00
    285673 285676 0 1.000000000e-04
    285673 285677 29 6.282852691e+00
    285673 285678 7 2.832127095e+00
    285673 285679 3 1.527846053e+00
    285673 285680 6 1.671614495e+00
    285673 285681 6 2.536869254e+00
    285673 285682 7 2.884160156e+00
    285673 285683 4 3.204107849e+00
    285673 285684 1 2.391641626e+00
    285673 285685 3 1.897513097e+00
    ...
  • The output Hi-C peak probability file is a text file with 5 columns: fragment 1 number, fragment 2 number, observed frequency, expected frequency and peak probability. For example, the first several lines of "simHiC_67frags_Results_SFA.txt" are
    Frag1 Frag2 ObservedCount ExpectedCount PeakProbability
    285673 285674 10 9.33468195 0.24914244750876813
    285673 285675 18 5.941096709 0.9743078614610075
    285673 285676 0 1.0E-4 0.9058814843813054
    285673 285677 29 6.282852691 0.9999618570953849
    285673 285678 7 2.832127095 0.9801960369924354
    285673 285679 3 1.527846053 0.9351673162377278
    285673 285680 6 1.671614495 0.9596658150480121
    285673 285681 6 2.536869254 0.9728450103182065
    285673 285682 7 2.884160156 0.9120565418613166
    285673 285683 4 3.204107849 0.19889169451577293
    285673 285684 1 2.391641626 0.016418594839788803
    285673 285685 3 1.897513097 0.09089796768730307
    ...
  • The output Hi-C parameter file is a text file listing the estimated parameters theta, phi (reported as inverse_phi) and psi, along with the running time. For example, the contents of the file "simHiC_67frags_Estimated_Parameters_SFA.txt" are
    The estimation algorithm is simulated field algorithm.
    theta 0.8653265542218813
    inverse_phi 0.07843172509982103
    psi 0.36919426955263107
    running_time in unit of a second 21.029
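
The two output files described above are straightforward to post-process. The following Python sketch reads the peak probability file, keeps the fragment pairs whose peak probability reaches a cutoff, and loads the parameter file into a dictionary. All function names are hypothetical. The cutoff of 0.9 is an arbitrary illustrative value, not a threshold recommended by FastHiC, and the parsing assumes whitespace-separated columns as in the examples shown.

```python
def read_peak_probabilities(path):
    """Yield (frag1, frag2, observed, expected, probability) per data line.

    Hypothetical helper; assumes one header line and whitespace-separated
    columns, as in the example results file.
    """
    with open(path) as handle:
        next(handle, None)  # skip the header line
        for line in handle:
            fields = line.split()
            if len(fields) != 5:
                continue  # ignore blank or malformed lines
            frag1, frag2, observed = (int(value) for value in fields[:3])
            yield frag1, frag2, observed, float(fields[3]), float(fields[4])


def select_peaks(records, threshold=0.9):
    """Keep records whose peak probability is at least `threshold`.

    The default of 0.9 is an illustrative assumption only.
    """
    return [record for record in records if record[4] >= threshold]


def read_estimated_parameters(path):
    """Return {label: value} for lines that end in a number.

    The descriptive first line of the parameter file does not end in a
    number and is therefore skipped.
    """
    params = {}
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) < 2:
                continue
            try:
                value = float(fields[-1])
            except ValueError:
                continue
            params[" ".join(fields[:-1])] = value
    return params
```

For instance, select_peaks(read_peak_probabilities("simHiC_67frags_Results_SFA.txt")) would return the pairs at or above the chosen cutoff, and read_estimated_parameters("simHiC_67frags_Estimated_Parameters_SFA.txt")["theta"] would give the estimated theta.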