Reading Spatial Transcriptomics Data with Python

生信技能树
Published 2025-03-06 23:32:26
Included in the column: 生信技能树

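Starting this year, 生信技能树 will actively promote a Python-based bioinformatics ecosystem and publish many Python bioinformatics analysis tutorials. Stay tuned for the new series 《python生信笔记2025》.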

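The previous post covered reading different single-cell data formats with Python: 《python版读取不同的单细胞数据格式(单样本与多样本)》 (single-sample and multi-sample). Today we look at reading spatial transcriptomics data with Python.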

0. Preparing the example data

This tutorial uses the Mouse Brain (Coronal) Visium dataset published by 10x Genomics.

Download link: https://www.10xgenomics.com/datasets/mouse-brain-section-coronal-1-standard-1-0-0

Download:

# Output Files
wget https://cf.10xgenomics.com/samples/spatial-exp/1.0.0/V1_Adult_Mouse_Brain/V1_Adult_Mouse_Brain_molecule_info.h5
wget https://cf.10xgenomics.com/samples/spatial-exp/1.0.0/V1_Adult_Mouse_Brain/V1_Adult_Mouse_Brain_filtered_feature_bc_matrix.h5
wget https://cf.10xgenomics.com/samples/spatial-exp/1.0.0/V1_Adult_Mouse_Brain/V1_Adult_Mouse_Brain_filtered_feature_bc_matrix.tar.gz
wget https://cf.10xgenomics.com/samples/spatial-exp/1.0.0/V1_Adult_Mouse_Brain/V1_Adult_Mouse_Brain_analysis.tar.gz
wget https://cf.10xgenomics.com/samples/spatial-exp/1.0.0/V1_Adult_Mouse_Brain/V1_Adult_Mouse_Brain_spatial.tar.gz
wget https://cf.10xgenomics.com/samples/spatial-exp/1.0.0/V1_Adult_Mouse_Brain/V1_Adult_Mouse_Brain_metrics_summary.csv
wget https://cf.10xgenomics.com/samples/spatial-exp/1.0.0/V1_Adult_Mouse_Brain/V1_Adult_Mouse_Brain_web_summary.html
wget https://cf.10xgenomics.com/samples/spatial-exp/1.0.0/V1_Adult_Mouse_Brain/V1_Adult_Mouse_Brain_cloupe.cloupe

Arrange the files into the following layout:

mouse-brain-section-coronal-1-standard-1-1-0/
├── filtered_feature_bc_matrix
│   ├── barcodes.tsv.gz
│   ├── features.tsv.gz
│   └── matrix.mtx.gz
├── filtered_feature_bc_matrix.h5
├── spatial
│   ├── aligned_fiducials.jpg
│   ├── detected_tissue_image.jpg
│   ├── scalefactors_json.json
│   ├── tissue_hires_image.png
│   ├── tissue_lowres_image.png
│   └── tissue_positions_list.csv
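
The layout above can be produced by hand, or with a small script. The following is a minimal sketch, not part of the original tutorial: the function name `arrange_visium_files` is hypothetical, and it assumes that each of the two `.tar.gz` archives unpacks into a single top-level folder (`filtered_feature_bc_matrix/` and `spatial/`), which is worth checking against your own download.

```python
import shutil
import tarfile
from pathlib import Path

# Sample prefix taken from the download URLs above.
PREFIX = "V1_Adult_Mouse_Brain_"


def arrange_visium_files(download_dir, target_dir, prefix=PREFIX):
    """Unpack the two archives and copy the HDF5 matrix under its generic name."""
    download_dir, target_dir = Path(download_dir), Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)

    # Assumption: each archive holds one top-level folder named like the archive.
    for name in ("filtered_feature_bc_matrix", "spatial"):
        with tarfile.open(download_dir / f"{prefix}{name}.tar.gz") as archive:
            archive.extractall(target_dir)

    # The readers used later expect the matrix file without the sample prefix.
    shutil.copy2(
        download_dir / f"{prefix}filtered_feature_bc_matrix.h5",
        target_dir / "filtered_feature_bc_matrix.h5",
    )
    return target_dir
```

For example, `arrange_visium_files(".", "mouse-brain-section-coronal-1-standard-1-1-0")` would build the directory used in the rest of this post from files downloaded into the current directory.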

1. Reading with stlearn

Reference: https://stlearn.readthedocs.io/en/latest/tutorials/stSME_clustering.html

Environment setup (this package is particularly difficult to install):

conda create -n stlearn python=3.8 -y
conda activate stlearn
pip install stlearn==0.4.0 -i https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
pip install jupyterlab -i https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
pip install --upgrade scanpy

Read the data with the st.Read10X function:

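import os

# Assumption: protobuf reads this variable when it is first imported,
# so it is set before stlearn (and its dependencies) are imported.
os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = "python"

import numpy as np
import pandas as pd
import stlearn as st
from pathlib import Path

# Read the data
data = st.Read10X("mouse-brain-section-coronal-1-standard-1-1-0/")
data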
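
For orientation, the files under `spatial/` can also be inspected directly with the standard library. The sketch below is an illustration rather than what stlearn does internally. It assumes the older header-less `tissue_positions_list.csv` with the column order documented by 10x (barcode, in_tissue, array_row, array_col, pxl_row_in_fullres, pxl_col_in_fullres) and the `tissue_hires_scalef` key in `scalefactors_json.json`. The helper names are hypothetical.

```python
import csv
import json
from pathlib import Path

# Column order of the header-less tissue_positions_list.csv (assumed, see lead-in).
POSITION_COLUMNS = [
    "barcode", "in_tissue", "array_row", "array_col",
    "pxl_row_in_fullres", "pxl_col_in_fullres",
]


def load_spot_positions(spatial_dir):
    """Return one dict per spot, with numeric fields converted from text."""
    spots = []
    with open(Path(spatial_dir) / "tissue_positions_list.csv", newline="") as handle:
        for row in csv.reader(handle):
            spot = dict(zip(POSITION_COLUMNS, row))
            for key in ("in_tissue", "array_row", "array_col"):
                spot[key] = int(spot[key])
            for key in ("pxl_row_in_fullres", "pxl_col_in_fullres"):
                spot[key] = float(spot[key])
            spots.append(spot)
    return spots


def load_scalefactors(spatial_dir):
    """Read the scale factors that relate full-resolution pixels to the images."""
    with open(Path(spatial_dir) / "scalefactors_json.json") as handle:
        return json.load(handle)


def hires_pixel_coords(spot, scalefactors):
    """Scale a spot's full-resolution pixel position to tissue_hires_image.png."""
    scale = scalefactors["tissue_hires_scalef"]
    return spot["pxl_row_in_fullres"] * scale, spot["pxl_col_in_fullres"] * scale
```

Counting the spots with `in_tissue == 1` and comparing that number with the number of observations in the loaded object is a quick sanity check that the filtered matrix and the spatial folder belong together.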

The result is an AnnData object. It is then normalized, reduced in dimension, and clustered with stlearn:

# pre-processing for gene count table
st.pp.filter_genes(data,min_cells=1)
st.pp.normalize_total(data)
st.pp.log1p(data)

# pre-processing for spot image
st.pp.tiling(data, "./tiles")

# this step uses a deep learning model to extract high-level features from the tile images
# it may take a few minutes to complete
st.pp.extract_feature(data)

# run PCA for gene expression data
st.em.run_pca(data,n_comps=50)

data_SME = data.copy()
# apply stSME to normalise log transformed data
st.spatial.SME.SME_normalize(data_SME, use_data="raw")
data_SME.X = data_SME.obsm['raw_SME_normalized']
st.pp.scale(data_SME)
st.em.run_pca(data_SME,n_comps=50)

K-means clustering result:

# K-means clustering on stSME normalised PCA
st.tl.clustering.kmeans(data_SME,n_clusters=19, use_data="X_pca", key_added="X_pca_kmeans")
st.pl.cluster_plot(data_SME, use_label="X_pca_kmeans")

Louvain clustering result:

# louvain clustering on stSME normalised data
st.pp.neighbors(data_SME,n_neighbors=17,use_rep='X_pca')
st.tl.clustering.louvain(data_SME, resolution=1.19)
st.pl.cluster_plot(data_SME,use_label="louvain")
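
With two clusterings on the same spots, it can help to see how the K-means labels and the Louvain labels overlap. The following is a small standard-library sketch under the assumption that both label columns (`X_pca_kmeans` and `louvain`, as created above) can be iterated in the same spot order. The function names are hypothetical and not part of stlearn.

```python
from collections import Counter


def cross_tabulate(labels_a, labels_b):
    """Count how many spots fall into each (label_a, label_b) pair."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both label sequences must describe the same spots")
    return Counter(zip(labels_a, labels_b))


def best_match(table):
    """For each label of the first clustering, find the label of the second
    clustering that shares the most spots, together with that spot count."""
    best = {}
    for (label_a, label_b), count in table.items():
        if label_a not in best or count > best[label_a][1]:
            best[label_a] = (label_b, count)
    return best
```

A call such as `best_match(cross_tabulate(data_SME.obs["X_pca_kmeans"], data_SME.obs["louvain"]))` would then show, for every K-means cluster, the Louvain cluster it overlaps with most.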

2. Reading with scanpy

Reference: https://scanpy.readthedocs.io/en/stable/tutorials/basics/clustering.html

Read the data with sc.read_visium. Once loaded, preprocessing is the same as for single-cell data:

import scanpy as sc
adata = sc.read_visium(path="../stLearn/mouse-brain-section-coronal-1-standard-1-1-0/")
# gene symbols can be duplicated in 10x references; make them unique
adata.var_names_make_unique()
adata

Then run simple preprocessing, followed by dimensionality reduction and clustering:

# This is a mouse dataset, so gene symbols are not all upper-case as in human
# mitochondrial genes, "MT-" for human, "mt-" for mouse
adata.var["mt"] = adata.var_names.str.startswith("mt-")
# ribosomal genes
adata.var["ribo"] = adata.var_names.str.startswith(("Rps", "Rpl"))
# hemoglobin genes
adata.var["hb"] = adata.var_names.str.contains("^Hb[^(p)]")
sc.pp.calculate_qc_metrics(adata, qc_vars=["mt", "ribo", "hb"], inplace=True, log1p=True)

# Saving count data
adata.layers["counts"] = adata.X.copy()
# Normalizing to median total counts
sc.pp.normalize_total(adata)
# Logarithmize the data
sc.pp.log1p(adata)

# Identify highly variable genes
sc.pp.highly_variable_genes(adata, n_top_genes=2000)
sc.pl.highly_variable_genes(adata)
# pca
sc.tl.pca(adata)
sc.pp.neighbors(adata)
sc.tl.umap(adata)
# Using the igraph implementation and a fixed number of iterations can be significantly faster, especially for larger datasets
sc.tl.leiden(adata, flavor="igraph", n_iterations=2)
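
To make the normalization step less of a black box, here is a simplified pure-Python sketch of what "normalizing to median total counts" followed by `log1p` computes. It is an illustration on nested lists, not scanpy's implementation, and it assumes the target is the median of the non-zero per-spot totals. The function name is hypothetical.

```python
import math
from statistics import median


def normalize_and_log(counts):
    """counts: one list of raw gene counts per spot.

    Every spot is rescaled so that its total equals the median total over all
    spots with at least one count, then log(1 + x) is applied to each value.
    """
    totals = [sum(spot) for spot in counts]
    target = median(total for total in totals if total > 0)

    result = []
    for spot, total in zip(counts, totals):
        # Spots without any counts stay at zero instead of dividing by zero.
        factor = target / total if total > 0 else 0.0
        result.append([math.log1p(value * factor) for value in spot])
    return result
```

After this transformation, spots with very different sequencing depth become comparable, which is why the raw counts are first saved in `adata.layers["counts"]` in the code above.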

Visualize the result:

sc.pl.umap(adata, color=["leiden"])

Spatial cluster plot:

sc.pl.spatial(adata, img_key = "hires", color="leiden", size=1.2)

Note that sc.read_visium has been deprecated in scanpy 1.11.0 and later.
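
A script that has to run in several environments can check the installed version before deciding which reader to call. The sketch below only covers the version comparison. The helper names are hypothetical, the 1.11.0 threshold is the one stated above, and pre-releases such as `1.11.0rc1` are treated like the final release.

```python
import re


def release_tuple(version_text, width=3):
    """Leading numeric release of a version string, zero-padded to `width` parts."""
    match = re.match(r"\d+(?:\.\d+)*", version_text)
    if match is None:
        raise ValueError(f"not a version string: {version_text!r}")
    parts = [int(piece) for piece in match.group(0).split(".")]
    parts += [0] * (width - len(parts))
    return tuple(parts)


def read_visium_is_deprecated(scanpy_version):
    """True if the given scanpy version is 1.11.0 or newer."""
    return release_tuple(scanpy_version) >= (1, 11, 0)
```

In a session with scanpy imported as `sc`, `read_visium_is_deprecated(sc.__version__)` gives the answer for the current environment. Which replacement reader to use is best confirmed in the deprecation notice of the installed scanpy version.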

The next post will cover reading 10x Visium HD spatial transcriptomics data with SpatialData.

Originally published 2025-03-04 on the 生信技能树 WeChat official account.
