Closed
Description
From msprime created by marianne-aspbury: tskit-dev/msprime#802
This is a pretty basic file format to output. I have something that works for me, but I don't know how to integrate it into tskit. It would be nice to do so, so I'm noting it here for the future. (I'm also happy to do this myself, with some direction.)
This is what I'm doing for my personal use at the moment:
# ts is assumed to be an existing tree sequence (e.g. from an msprime simulation)
# the actual haplotype strings, one per sample, in the order of ts.samples()
haps = list(ts.haplotypes())
# The ">ID" parts of the fasta format. Don't know what best info to include is,
# I'm just using sample index and the population because that's relevant for me.
# Possibly people could choose what to include for ID strings in a fasta output function call
samples = ts.samples()
sequence_IDs = []
for i in range(len(haps)):
    # look up the population through the sample's node ID
    sequence_IDs.append(f'sample_{samples[i]}_pop_{ts.node(samples[i]).population}')
# saving the file
with open('Sim_fasta_file.txt', 'w') as f:
    for i in range(len(haps)):
        f.write(f'>{sequence_IDs[i]}\n{haps[i]}\n')
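To make the "choose what to include for ID strings" idea a bit more concrete, here is a rough sketch of what a reusable function might look like. Everything in it is an assumption for illustration: `write_fasta`, its parameters, and the `wrap_width` option are hypothetical names, not an existing tskit API. It works on plain strings so it runs without a tree sequence; with a real one, the haplotypes would come from `ts.haplotypes()` and the IDs would be built from `ts.samples()` as above.

```python
import io


def write_fasta(haplotypes, sequence_ids, output, wrap_width=60):
    """Write (ID, haplotype) pairs to a text file object in FASTA format.

    Hypothetical helper for illustration; not part of tskit's API.
    wrap_width=0 writes each sequence on a single line.
    """
    for seq_id, hap in zip(sequence_ids, haplotypes):
        output.write(f">{seq_id}\n")
        if wrap_width > 0:
            # Split the sequence into chunks of at most wrap_width characters
            for start in range(0, len(hap), wrap_width):
                output.write(hap[start:start + wrap_width] + "\n")
        else:
            output.write(hap + "\n")


# Toy data standing in for ts.haplotypes() and IDs built from ts.samples()
haps = ["0110", "1001"]
ids = ["sample_0_pop_0", "sample_1_pop_1"]

buffer = io.StringIO()
write_fasta(haps, ids, buffer, wrap_width=0)
fasta_text = buffer.getvalue()
```

Passing the IDs in as a separate argument leaves the caller free to decide what goes after the `>`, and taking a file object rather than a filename means the same function can write to disk or to an in-memory buffer.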