parse(file_path, format)
e.g. counting GC content in DNA
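from Bio.SeqUtils import gc_fraction  # replaces the older GC(), which was removed in recent Biopython releases; returns a fraction (0-1), not a percentage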
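As a rough illustration of what "GC content" means, here is a minimal standard-library sketch that does not depend on Biopython. The function name `gc_fraction_simple` is hypothetical, and the sketch assumes plain A/C/G/T input (ambiguous IUPAC codes are simply counted as non-GC).

```python
def gc_fraction_simple(seq: str) -> float:
    """Return the fraction of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    # Count every base that is either G or C.
    gc_count = sum(1 for base in seq if base in "GC")
    return gc_count / len(seq)
```

For example, `gc_fraction_simple("ATGC")` counts two of four bases as G or C.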
template_dna = coding_dna.reverse_complement()
messenger_rna = coding_dna.transcribe()
coding_dna = messenger_rna.back_transcribe()
messenger_rna.translate() # * marks a stop codon
coding_dna.translate() # translating directly from DNA to protein also works
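To make the operations above concrete, the following is a standard-library sketch of what reverse complement, transcription, back-transcription and translation compute. It is not Biopython's implementation; the function names are hypothetical, uppercase unambiguous input is assumed, and the codon table is the standard genetic code written in NCBI's TCAG ordering.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def reverse_complement(dna: str) -> str:
    """Complement each base, then reverse the strand."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def transcribe(dna: str) -> str:
    """Coding-strand DNA to mRNA: replace T with U."""
    return dna.replace("T", "U")

def back_transcribe(rna: str) -> str:
    """mRNA back to coding-strand DNA: replace U with T."""
    return rna.replace("U", "T")

def translate(seq: str) -> str:
    """Translate DNA or RNA codon by codon; '*' marks a stop codon."""
    dna = back_transcribe(seq.upper())
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))
```

Because `translate` back-transcribes its input first, it accepts either the DNA or the mRNA form, which mirrors the note that translation works directly from DNA.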
from Bio import Entrez, SeqIO
Identify regions of similarity that may indicate functional, structural, and evolutionary relationships between two biological sequences. e.g. global/local alignment --> result: match score & gap penalties
from Bio import pairwise2
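How match scores and gap penalties combine into a global alignment score can be sketched with a small Needleman-Wunsch dynamic program. This is a standard-library illustration, not the pairwise2 implementation; the function name and the default scores (match +1, mismatch -1, linear gap -1) are assumptions chosen for the example.

```python
def global_alignment_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -1) -> int:
    """Best global alignment score of a and b with a linear gap penalty."""
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix of b
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            gap_in_b = prev[j] + gap      # a[i-1] aligned to a gap
            gap_in_a = curr[j - 1] + gap  # b[j-1] aligned to a gap
            curr.append(max(diagonal, gap_in_b, gap_in_a))
        prev = curr
    return prev[-1]
```

A local (Smith-Waterman style) variant would additionally clamp each cell at zero and take the maximum over the whole table.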
BLAST programs: blastn, blastp, blastx, tblastn, tblastx
from Bio.Blast import NCBIWWW
from Bio import motifs
from Bio.Seq import Seq
instances = [
m = motifs.create(instances) # build a motif from the instances
m.instances # back to the instances
m.consensus # the largest values in the columns of the .counts matrix
m.anticonsensus # the smallest values in the columns of the .counts matrix
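The relationship between the counts matrix and the consensus / anticonsensus can be sketched in plain Python. This is only an illustration of the idea, not the Bio.motifs code; the function names are hypothetical, and ties are broken in alphabet order (A, C, G, T), which is an assumption of this sketch.

```python
def count_matrix(instances, alphabet="ACGT"):
    """Per-letter counts for each column of equal-length sequences."""
    length = len(instances[0])
    if any(len(seq) != length for seq in instances):
        raise ValueError("all instances must have the same length")
    return {
        letter: [sum(1 for seq in instances if seq[i] == letter)
                 for i in range(length)]
        for letter in alphabet
    }

def _pick(counts, chooser):
    """Apply max or min column by column over the counts matrix."""
    length = len(next(iter(counts.values())))
    return "".join(
        chooser(counts, key=lambda letter: counts[letter][i])
        for i in range(length)
    )

def consensus(counts):
    """Letter with the largest count in each column."""
    return _pick(counts, max)

def anticonsensus(counts):
    """Letter with the smallest count in each column."""
    return _pick(counts, min)
```

Usage would look like `consensus(count_matrix(["TACAA", "TACGC"]))`, with each column resolved independently.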
Indexing a FASTQ file / Sorting a sequence file (FASTA/FASTQ) / Filtering a FASTQ file
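A minimal standard-library sketch of the sorting and filtering tasks listed above. It assumes simple four-line FASTQ records with Phred+33 quality encoding; the function names and the minimum-quality criterion are illustrative assumptions, not the notes' original recipe.

```python
def read_fastq(handle):
    """Yield (name, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of file
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        yield header[1:], seq, qual  # drop the leading '@'

def min_phred(qual, offset=33):
    """Lowest Phred score in a quality string (Phred+33 by default)."""
    return min(ord(ch) - offset for ch in qual)

def filter_by_quality(records, threshold=20):
    """Keep records whose every base has Phred quality >= threshold."""
    return [rec for rec in records if min_phred(rec[2]) >= threshold]

def sort_by_length(records):
    """Sort records from shortest to longest sequence."""
    return sorted(records, key=lambda rec: len(rec[1]))
```

The functions accept any text file object, for example `filter_by_quality(read_fastq(open(path)))` where `path` is a placeholder for a FASTQ file.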
Reference:
The package most commonly used for handling drug molecules in Python
from rdkit import Chem
from rdkit.Chem.Draw import IPythonConsole
from rdkit.Chem import Draw
IPythonConsole.ipython_useSVG=True #< set this to False if you want PNGs instead of SVGs
mol = Chem.MolFromSmiles("C1CC2=C3C(=CC=C2)C(=CN3C1)[C@H]4[C@@H](C(=O)NC4=O)C5=CNC6=CC=CC=C65")
from rdkit.Chem import AllChem
rxn = AllChem.ReactionFromSmarts('[cH1:1]1:[c:2](-[CH2:7]-[CH2:8]-[NH2:9]):[c:3]:[c:4]:[c:5]:[c:6]:1.[#6:11]-[CH1;R0:10]=[OD1]>>[c:1]12:[c:2](-[CH2:7]-[CH2:8]-[NH1:9]-[C:10]-2(-[#6:11])):[c:3]:[c:4]:[c:5]:[c:6]:1')
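Chem.RDKFingerprint(): 2048 bits by default
MACCS keys: 166 structural keys
Morgan: the radius sets how far from each central atom the neighboring atoms are taken into account --> ECFP (radius 2 corresponds to ECFP4)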
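The role of the Morgan radius can be illustrated without RDKit: for each central atom, only atoms reachable within `radius` bonds contribute to that atom's environment. The sketch below only collects that neighborhood with a breadth-first search; it does not hash environments into bits, so it is not a fingerprint implementation. The adjacency-dictionary input and the function name are assumptions for the example.

```python
from collections import deque

def atoms_within_radius(adjacency, center, radius):
    """Return the set of atom indices at most `radius` bonds from `center`."""
    distance = {center: 0}
    queue = deque([center])
    while queue:
        atom = queue.popleft()
        if distance[atom] == radius:
            continue  # do not expand beyond the requested radius
        for neighbor in adjacency[atom]:
            if neighbor not in distance:
                distance[neighbor] = distance[atom] + 1
                queue.append(neighbor)
    return set(distance)
```

For a four-atom chain `{0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}`, increasing the radius around atom 0 adds one more atom of the chain at each step.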
Reference: - [2022] RDKit의 기초와 이를 이용한 화학정보학 실습 (Basics of RDKit and hands-on cheminformatics practice using it), 이주용
--> ligands / alternate locations / non-standard amino acid residues / negative sequence numbers / sequence gaps / insertion codes / multiple chains / hydrogen atoms
--> remove these, then save the cleaned structure
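One way to picture the clean-up step is a line filter over the fixed-column PDB format. The sketch below is a standard-library illustration under several assumptions: well-formed ATOM/HETATM lines, a single chain of interest (hypothetical default "A"), and the policy of dropping residues with insertion codes or negative numbers rather than renumbering them. Sequence gaps cannot be repaired by filtering and are not handled here.

```python
STANDARD_RESIDUES = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}

def clean_pdb_lines(lines, chain="A"):
    """Keep only heavy atoms of standard residues from one chain."""
    kept = []
    for line in lines:
        if not line.startswith("ATOM"):
            continue  # drops HETATM (ligands, waters) and non-atom records
        if line[21] != chain:
            continue  # other chains
        if line[16] not in (" ", "A"):
            continue  # alternate locations other than the first
        if line[17:20] not in STANDARD_RESIDUES:
            continue  # non-standard amino acid residues
        if line[26] != " ":
            continue  # residues carrying an insertion code
        if int(line[22:26]) < 0:
            continue  # negative residue sequence numbers
        if line[76:78].strip() == "H":
            continue  # hydrogen atoms (element columns 77-78)
        kept.append(line)
    return kept
```

Saving then amounts to writing the returned lines to a new file, for example with `open(out_path, "w").writelines(...)`, where `out_path` is a placeholder.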
For manipulating and doing calculations on wwPDB macromolecule structure files
Information on:
Hydrophobic Interaction
Hydrogen Bonds
Halogen Bonds
PLIP's main purpose is to find interactions between a small molecule and a protein.
However, PLIP can also detect binding to nucleic acids (DNA, RNA).
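The distance part of finding ligand-protein contacts can be sketched as follows. This is not PLIP's algorithm: real interaction profiling also relies on atom typing and, for interactions such as hydrogen bonds, on angle criteria. The function name, the `(name, (x, y, z))` input format, and the 4.0 Å cutoff are all assumptions made for illustration.

```python
import math

def contacts_within(ligand_atoms, protein_atoms, cutoff=4.0):
    """List (ligand atom, protein atom, distance) pairs within `cutoff` angstroms."""
    pairs = []
    for lig_name, lig_xyz in ligand_atoms:
        for prot_name, prot_xyz in protein_atoms:
            distance = math.dist(lig_xyz, prot_xyz)  # Euclidean distance
            if distance <= cutoff:
                pairs.append((lig_name, prot_name, round(distance, 2)))
    return pairs
```

Each atom is passed as a name plus a coordinate tuple, for example `("C1", (0.0, 0.0, 0.0))`.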