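Generate 20 random files. The file names are not hash-based, so they can repeat across runs, in which case an existing file with the same name is overwritten.
Each file contains 30-50 sequences.
Each sequence is 70-120 bases long.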

    import os
    import random
    import string

    print(dir(string))  # list the attributes available in the string module
    letter = string.ascii_letters  # "abc...xyzABC...XYZ"
    os.chdir("D:\\")  # output directory; change as needed
    bases = {1: "A", 2: "T", 3: "C", 4: "G"}

    ## Test the random module: pick one random DNA base
    Nth = random.randint(1, 4)
    print(bases[Nth])

    ## Create random DNA sequences
    for i in range(20):
        Number_of_Seq = random.randint(30, 50)  # number of sequences in this file
        filename = letter[i]  # one distinct letter per file
        with open("Sequences" + filename + str(Number_of_Seq) + ".fasta", "w") as file_output:
            for j in range(Number_of_Seq):
                each_Seq = ""
                Rand_len = random.randint(70, 120)  # length of this sequence
                for k in range(Rand_len):
                    Nth = random.randint(1, 4)
                    each_Seq += bases[Nth]
                # The header holds the sequence count and length, so IDs can repeat within a file
                file_output.write(">seq_" + str(Number_of_Seq) + "_" + str(Rand_len) + "\n")
                file_output.write(each_Seq + "\n")
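
The script above builds each sequence one base at a time and names files by letter and sequence count. As a possible variation, the sketch below draws all bases of a sequence in one call with `random.choices` and names each file after a SHA-256 digest of its content, which makes name collisions very unlikely. The function names (`random_sequence`, `random_fasta`, `write_random_fasta_files`), the `Sequences_<digest>.fasta` naming pattern, the 12-character digest prefix, and the `seq_<index>_<length>` header format are assumptions made for illustration, not part of the original script.

```python
import hashlib
import random
from pathlib import Path

BASES = "ATCG"


def random_sequence(min_len=70, max_len=120, rng=random):
    """Return a random DNA string whose length lies in [min_len, max_len]."""
    length = rng.randint(min_len, max_len)
    return "".join(rng.choices(BASES, k=length))


def random_fasta(min_seqs=30, max_seqs=50, rng=random):
    """Build the text of one FASTA file; the running index keeps headers unique."""
    count = rng.randint(min_seqs, max_seqs)
    records = []
    for index in range(1, count + 1):
        seq = random_sequence(rng=rng)
        records.append(f">seq_{index}_{len(seq)}\n{seq}\n")
    return "".join(records)


def write_random_fasta_files(out_dir, n_files=20, rng=random):
    """Write n_files FASTA files, each named after a hash of its own content."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for _ in range(n_files):
        text = random_fasta(rng=rng)
        # Hypothetical naming scheme: first 12 hex digits of the SHA-256 digest.
        digest = hashlib.sha256(text.encode("ascii")).hexdigest()[:12]
        path = out_dir / f"Sequences_{digest}.fasta"
        path.write_text(text)
        paths.append(path)
    return paths
```

Calling `write_random_fasta_files("random_fasta")` would write 20 files into a `random_fasta` directory under the current working directory. Passing a seeded `random.Random` instance as `rng` makes the output reproducible.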