R 语言 SEM|结构方程模型直接间接效应|可视化

公司资讯 admin 发布时间:2024-06-21 浏览:12 次

📢 预告:新版 R 语言 SEM 可视化 Shiny APP 将于近期推送。

往期

👉 鼠标修改布局!全球独家 R 语言 SEM 可视化工具👉 完全用 R 语言实现 Manuel Delgado-Baquerizo 风格的 SEM 图👉️ 尝试用 R 语言画一个 PNAS 文章的 SEM 图👉 R 语言 SEM 之多组分析(Multigroup)👉 R 语言 SEM 之复合变量(Composite)的构造案例👉 R 语言 SEM 之使用 ggplot2 衍生包画 Nature 结构方程模型图👉 R 语言 SEM 之贝叶斯(Bayesian)结构方程模型👉 R 语言 SEM 合集(点击 👉 即可跳转。)

目录

引言1 需要的包2 需要的数据3 使用 piecewiseSEM 拟合4 SEM 可视化:计算直接和间接效应 4.0 用鼠标绘制 SEM 图示 APP(更新) 4.1 第一步:将 SEM 结果变成网络 4.2 绘图:结构方程模型路径图 4.3 绘图:直接和间接效应6 SEM:重抽样的直接和间接效应 6.1 重抽样:直接和间接效应 6.2 重抽样:直接和间接效应的置信区间7 附:getEffects 函数

引言

探索使用代码(而不是动手)计算 SEM 的直接和间接效应。

1 需要的包

library(tidyverse)library(piecewiseSEM)library(tidygraph)library(ggraph)library(igraph)

2 需要的数据

数据标准化,

# 来自 piecewiseSEM 包data(keeley)# 数据标准化keeleyScaled <- keeley |> dplyr::mutate_if( is.numeric, ~ as.numeric(scale(.x)) )str(keeleyScaled)

data.frame: 90 obs. of 8 variables:

$ distance: num 0.473 -1.381 0.505 0.505 0.309 ...

$ elev : num 3.1 -1.41 -0.87 -0.87 2.11 ...

$ abiotic : num 1.489 -1.08 0.228 1.552 -0.335 ...

$ age : num 1.1486 -0.0451 -0.8409 -0.8409 -0.2043 ...

$ hetero : num 0.6423 -1.672 1.4037 0.0656 -1.1992 ...

$ firesev : num -0.645 -0.312 -1.189 -1.008 -0.16 ...

$ cover : num 1.096 -0.673 0.812 1.588 1.913 ...

$ rich : num 0.117 -1.207 1.441 0.978 1.242 ...

3 使用 piecewiseSEM 拟合

# 拟合 SEMpSem <- psem( lm(age ~ distance, data = keeleyScaled), lm(hetero ~ distance, data = keeleyScaled), lm(abiotic ~ distance, data = keeleyScaled), lm(firesev ~ age, data = keeleyScaled), lm(cover ~ firesev, data = keeleyScaled), lm(rich ~ distance + hetero + abiotic + cover, data = keeleyScaled))# 路径系数# coefs(pSem) |> print()

4 SEM 可视化:计算直接和间接效应

整理 SEM 结果,

# 结果整理为 from、to、weight、p 数据框semCoefs <- coefs(pSem) |> dplyr::relocate( from = Predictor, to = Response, weight = Estimate, p = P.Value )semCoefs |> print(row.names = FALSE)

from to weight p Std.Error DF Crit.Value Std.Estimate

distance age -0.2781 0.0079 0.1024 88 -2.7164 -0.2781 **

distance hetero 0.3460 0.0008 0.1000 88 3.4593 0.3460 ***

distance abiotic 0.4597 0.0000 0.0947 88 4.8562 0.4597 ***

age firesev 0.4539 0.0000 0.0950 88 4.7781 0.4539 ***

firesev cover -0.4371 0.0000 0.0959 88 -4.5594 -0.4371 ***

distance rich 0.2829 0.0021 0.0892 85 3.1710 0.2829 **

hetero rich 0.3383 0.0001 0.0825 85 4.1035 0.3383 ***

abiotic rich 0.2495 0.0037 0.0837 85 2.9815 0.2495 **

cover rich 0.2866 0.0005 0.0790 85 3.6297 0.2866 ***

对于 lavaan、plspm 或者 brms 的结构方程模型的整理,见以下超链接,

👉 拟合 SEM:lavaan、piecewiseSEM 和 plspm 三个包(点击 👉 即可跳转。)

4.0 用鼠标绘制 SEM 图示 APP(更新)

将结果整理为 from、to、weight、p 数据框,并存为 .csv 文件,还可以使用独家 APP,用鼠标点击,修改 SEM 图示的布局!

APP 有一个小更新,见本期另一篇推送。

# 保存数据write.csv(semCoefs, "sem_table.csv", row.names = FALSE)

打赏并私信,可以获取此 APP!

详情见本期另一篇推送,

另见以下超链接,

👉 全球独家 SEM 工具:用鼠标点击,可视化结构方程模型(点击 👉 即可跳转。)

4.1 第一步:将 SEM 结果变成网络

此处,对于 piecewiseSEM 的结果,可以这样转换为一个网络,

# igraph 和 tidygraph 包,将数据转换为一个图(graph)semGraph <- semCoefs |> select(from, to, weight, p) |> igraph::graph_from_data_frame(directed = TRUE) |> tidygraph::as_tbl_graph()semGraph

# A tbl_graph: 7 nodes and 9 edges

#

# A directed acyclic simple graph with 1 component

#

# Node Data: 7 × 1 (active)

name

1 distance

2 age

3 firesev

4 hetero

5 abiotic

6 cover

7 rich

#

# Edge Data: 9 × 4

from to weight p

1 1 2 -0.278 0.0079

2 1 4 0.346 0.0008

3 1 5 0.460 0

# ℹ 6 more rows

4.2 绘图:结构方程模型路径图

# 字体customizedFont <- "DejaVu Sans"# 简单的 SEM 可视化semGraph |> ggraph(layout = "tree") + # sugiyama, circle, tree geom_edge_arc( strength = -0.2, aes( edge_color = ifelse(weight > 0, "+", "-"), edge_width = abs(weight), label = round(weight, 3) ), start_cap = circle(0.5), end_cap = circle(0.5), label_size = 3, angle_calc = "along", vjust = 1.5, arrow = arrow(angle = 60, length = unit(5, "pt")), family = customizedFont ) + geom_node_label( aes(label = name), size = 4, family = customizedFont ) + scale_edge_color_brewer(palette = "Set2", direction = -1) + scale_edge_width_continuous(range = c(0.2, 2)) + coord_equal(clip = "off") + theme_void( base_family = customizedFont, base_size = 18 ) + theme( legend.position = "none", strip.text = element_text(size = 12) )

Warning: Using the `size` aesthetic in this geom was deprecated in ggplot2 3.4.0.

ℹ Please use `linewidth` in the `default_aes` field and elsewhere instead.

This warning is displayed once every 8 hours.

Call `lifecycle::last_lifecycle_warnings()` to see where this warning was

generated.

4.3 绘图:直接和间接效应

使用代码,计算直接和间接效应,

好处是,适用性强,只要将结果整理为 from、to、weight、p 数据框(如下)就可以使用,

缺点是,无法检验间接路径显著性(但是 semEff 包可以,见后文),

getEffects 函数见文末,另存为 getEffects.R,然后 source("getEffects.R") 调用之。

# 使用函数source("getEffects.R")effectData <- getEffects(semGraph)# 绘制效应effectData |> ggplot() + aes( x = from, y = value, fill = name ) + geom_col( color = "grey0", lwd = 1.5 ) + geom_label( aes(label = round(value, 3)), size = 3, position = position_stack(vjust = 0.5), family = customizedFont ) + scale_fill_manual( name = NULL, values = c("Direct" = "green4", "Indirect" = "grey100"), na.translate = FALSE ) + facet_grid(. ~ paste("Effects to:", to)) + coord_cartesian(ylim = c(-0.2, 0.6)) + theme_minimal( base_family = customizedFont, base_size = 16, base_line_size = 2 ) + theme( panel.grid.major.x = element_blank(), panel.grid.minor = element_blank(), strip.text = element_text(size = 14) )

Warning: Removed 5 rows containing missing values or values outside the scale range

(`geom_col()`).Warning: Removed 5 rows containing missing values or values outside the scale range

(`geom_label()`).

6 SEM:重抽样的直接和间接效应

用了另一个现成的包,如下,

因为是重抽样获得,所以与上述根据结果计算略有不同,

而且,可以检验间接效应是否显著!

# https://murphymv.github.io/semEff/articles/semEff.htmllibrary(semEff)# 重抽样 1000 次(并计时)system.time( pSemBoot <- semEff::bootEff( mod = pSem, R = 100, # 可设置为 1000 或以上 seed = 1, parallel = "multicore" ))

user system elapsed

13.331 7.083 2.675

获取效应,

# 获取效应semEff <- semEff::semEff(pSemBoot)# summary(semEff) # 此代码,获取所有效应,去掉注释查看效果# 各类效应semEff::getTotEff(semEff, "rich") |> round(3) # 总效应

(Intercept) distance age hetero abiotic firesev

0.000 0.453 -0.053 0.301 0.219 -0.117

cover

0.267semEff::getDirEff(semEff, "rich") |> round(3) # 总效应

(Intercept) distance hetero abiotic cover

0.000 0.233 0.301 0.219 0.267semEff::getIndEff(semEff, "rich") |> round(3) # 总效应

(Intercept) distance age firesev

0.000 0.220 -0.053 -0.117

6.1 重抽样:直接和间接效应

因为是重抽样获得,所以与上述根据结果计算略有不同。

# 结果存储在 effRes 变量中effRes <- semEff::getEff(semEff, "rich")# 提取直接、间接、总效应dEf <- effRes$rich$DirectindEf <- effRes$rich$IndirecttotalEf <- effRes$rich$Total# 转换为数据框funGetEff <- function(Tp, Ef) { data.frame(effType = Tp, Effect = Ef, row.names = names(Ef)) |> tibble::rownames_to_column(var = "Predictor")}dEfDf <- funGetEff("Direct", dEf)indEfDf <- funGetEff("Indirect", indEf)totalEfDf <- funGetEff("Total", totalEf)# 合并数据框combinedDf <- rbind(dEfDf, indEfDf, totalEfDf) |> dplyr::filter(Predictor != "(Intercept)")# 绘图combinedDf |> dplyr::filter(effType != "Total") |> ggplot() + aes( x = Predictor, y = Effect, fill = effType ) + geom_col( color = "grey0", lwd = 1.5 ) + geom_label( aes(label = round(Effect, 3)), size = 3, position = position_stack(vjust = 0.5), family = customizedFont ) + scale_fill_manual( name = NULL, values = c("Direct" = "green4", "Indirect" = "grey100"), na.translate = FALSE ) + facet_grid(. ~ paste("Effects to: rich")) + coord_cartesian(ylim = c(-0.2, 0.6)) + theme_minimal( base_family = customizedFont, base_size = 16, base_line_size = 2 ) + theme( panel.grid.major.x = element_blank(), panel.grid.minor = element_blank(), strip.text = element_text(size = 14) )

6.2 重抽样:直接和间接效应的置信区间

bootEff <- semEff$`Bootstrapped Effects`[["rich"]]lapply(bootEff, head, 3)

$Direct

(Intercept) distance hetero abiotic cover

[1,] -6.001212e-17 0.2489151 0.3069260 0.04232212 0.32140204

[2,] 1.596619e-17 0.3025537 0.4142906 0.12890911 0.10502234

[3,] -1.513694e-17 0.2867629 0.1815622 0.29990303 0.09868543

$Indirect

(Intercept) distance age firesev

[1,] -6.001212e-17 0.1586614 -0.044291734 -0.17155155

[2,] 1.596619e-17 0.1707798 -0.014439474 -0.03006690

[3,] -1.513694e-17 0.1651902 -0.006246364 -0.01870544

$Total

(Intercept) distance age hetero abiotic firesev

[1,] -6.001212e-17 0.4075764 -0.044291734 0.3069260 0.04232212 -0.17155155

[2,] 1.596619e-17 0.4733334 -0.014439474 0.4142906 0.12890911 -0.03006690

[3,] -1.513694e-17 0.4519531 -0.006246364 0.1815622 0.29990303 -0.01870544

cover

[1,] 0.32140204

[2,] 0.10502234

[3,] 0.09868543

$Mediators

age hetero abiotic firesev cover

[1,] 0.006250031 0.13073737 0.02167396 -0.038041703 -0.20959325

[2,] 0.002024480 0.10452326 0.06423202 -0.012414993 -0.04248189

[3,] 0.001294425 0.03769445 0.12620135 -0.004951939 -0.02365738# 将 list 中的每个矩阵转换为数据框,保留其名字 names;并合并所有数据框 bind_rowsbootEffDf <- map(bootEff, as.data.frame) |> map2(names(bootEff), ~ mutate(.x, effType = .y)) |> bind_rows() |> dplyr::select(-`(Intercept)`)# 计算每个路径的均值和 95% 置信区间summaryStats <- bootEffDf |> # 将数据转换为长格式,以便进行计算 pivot_longer( cols = -effType, names_to = "Predictor", values_to = "tempEstimate" ) |> # 按分组 .by 计算均值和 95% 置信区间 summarise( Estimate = mean(tempEstimate, na.rm = TRUE), ci95Lower = quantile(tempEstimate, 0.025, na.rm = TRUE), ci95Upper = quantile(tempEstimate, 0.975, na.rm = TRUE), .by = c(effType, Predictor) )# 绘图summaryStats |> filter(!effType %in% c("Total", "Mediators")) |> ggplot() + aes( x = Predictor, y = Estimate, ymin = ci95Lower, ymax = ci95Upper, color = effType ) + geom_col( width = 0.9, lwd = 1.5, fill = "white", position = position_stack(vjust = 0.5) # position = position_dodge(1) ) + # geom_errorbar( # width = 0, # lwd = 0.5, # position = position_dodge(1), # show.legend = FALSE # ) + geom_text( label = "✱", size = 2, position = position_stack(vjust = 0.5), # position = position_dodge(1), family = customizedFont ) + scale_color_manual( name = NULL, values = c("Direct" = "green4", "Indirect" = "grey0"), na.translate = FALSE ) + facet_grid(. ~ paste("Effects to: rich")) + coord_cartesian(ylim = c(-0.2, 0.6)) + theme_minimal( base_family = customizedFont, base_size = 16, base_line_size = 2 ) + theme( panel.grid.major.x = element_blank(), panel.grid.minor = element_blank(), strip.text = element_text(size = 14) )

Warning: Stacking not well defined when not anchored on the axisWarning: Removed 5 rows containing missing values or values outside the scale range

(`geom_col()`).Warning: Removed 5 rows containing missing values or values outside the scale range

(`geom_text()`).

7 附:getEffects 函数

getEffects <- function(semDag) { # 获取拓扑排序的节点名称(假设 topo 最后的节点,是关注的自变量;最终只有一个关注的自变量) dagTopo <- igraph::topo_sort(semDag)$name lenTopo <- length(dagTopo) # 初始化一个向量来存储所有路径的直接和间接效应 dEfVector <- numeric(length(dagTopo)) # 直接效应 indEfVector <- numeric(length(dagTopo)) # 间接效应 # 循环遍历拓扑排序的节点 for (j in seq_along(dagTopo)[-lenTopo]) { # 所有简单路径 allSimplePaths <- igraph::all_simple_paths( semDag, from = dagTopo[j], to = dagTopo[lenTopo] ) # 初始化间接效应的变量 directEf <- indirectEfSum <- 0 # 遍历所有路径 # 如果路径长度大于 2,则为间接路径,需要相乘再求和 # 如果路径长度为 2,则为直接路径,直接取宽度 for (i in allSimplePaths) { eWd <- E(semDag, path = i)$weight # 路径长度为 2 直接取宽度 if (length(i) == 2) { directEf = eWd } else { # 路径长度大于 2 需要相乘再求和 indirectEfSum = indirectEfSum + prod(eWd) } } # 存储直接效应和间接效应 dEfVector[j] = directEf indEfVector[j] = indirectEfSum } # 创建一个数据框来存储结果 efDf <- data.frame( from = dagTopo[-lenTopo], to = dagTopo[lenTopo], Direct = dEfVector[-lenTopo], Indirect = indEfVector[-lenTopo] ) |> tidyr::pivot_longer(cols = c(Direct, Indirect)) |> dplyr::mutate(value = ifelse(value == 0, NA, value)) # 返回 return(efDf)}

在线咨询

点击这里给我发消息售前咨询专员

点击这里给我发消息售后服务专员

在线咨询

免费通话

24h咨询:400-5026888


如您有问题,可以咨询我们的24H咨询电话!

免费通话

微信扫一扫

微信联系
返回顶部