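I am new to R.
I have values in columns 3 through 6 of a data frame that I want to plot as a dot plot. Each of columns 3 to 6 represents a month, and each row represents a day of the month, from 1 to 30. The numbers in the data frame are temperatures.
I want a plot with temperature on the y-axis and month on the x-axis. The plot should show a point for each temperature and a line running through the mean temperature of each month, so that the monthly average can be followed.
However, some temperatures are identical, so I would like to add a very small value to them so that many points are visible at the most common temperatures.
I tried: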
boxplot(dat3[,3:6],dat3=mean, geom="point", shape=18,
size=3, color="red")
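However, this does not draw a line between the means, and it plots the temperatures as bars. I only want the points and a line between the means.
Is this possible?
Thank you all.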
Posted on 2015-11-24 01:58:59
I made up a small (not real) data frame, but you can substitute your own data.
df <- structure(list(Month = structure(1:4, .Label = c("April", "May",
"June", "July"), class = "factor"), X1 = c(50, 55, 57, 68), X2 = c(60,
66, 68.4, 81.6), X3 = c(65, 71.5, 74.1, 88.4), X4 = c(40, 44,
45.6, 54.4), X5 = c(50, 55, 57, 68), X6 = c(60, 66, 68.4, 81.6
), X7 = c(65, 71.5, 74.1, 88.4), X8 = c(40, 44, 45.6, 54.4),
X9 = c(50, 55, 57, 68), X10 = c(60, 66, 68.4, 81.6), X11 = c(65,
71.5, 74.1, 88.4), X12 = c(40, 44, 45.6, 54.4), X13 = c(50,
55, 57, 68), X14 = c(60, 66, 68.4, 81.6), X15 = c(65, 71.5,
74.1, 88.4), X16 = c(40, 44, 45.6, 54.4), X17 = c(50, 55,
57, 68), X18 = c(60, 66, 68.4, 81.6), X19 = c(65, 71.5, 74.1,
88.4), X20 = c(40, 44, 45.6, 54.4), X21 = c(50, 55, 57, 68
), X22 = c(60, 66, 68.4, 81.6), X23 = c(65, 71.5, 74.1, 88.4
), X24 = c(40, 44, 45.6, 54.4), X25 = c(50, 55, 57, 68),
X26 = c(60, 66, 68.4, 81.6), X27 = c(65, 71.5, 74.1, 88.4
), X28 = c(40, 44, 45.6, 54.4), X29 = c(50, 55, 57, 68),
X30 = c(50, 55, 57, 68)), .Names = c("Month", "X1", "X2",
"X3", "X4", "X5", "X6", "X7", "X8", "X9", "X10", "X11", "X12",
"X13", "X14", "X15", "X16", "X17", "X18", "X19", "X20", "X21",
"X22", "X23", "X24", "X25", "X26", "X27", "X28", "X29", "X30"
), row.names = c(NA, -4L), class = "data.frame")
After some cleanup, there are several ways to plot your data; here is one of them:
library(dplyr)
library(reshape2) # provides melt()
library(stringr) # provides str_replace_all()
df$Month <- factor(df$Month, levels = c("April", "May", "June", "July")) # changed the order from alphabetical
df.m <- melt(df, id.vars = "Month") # melted the data frame into long format
df.m$variable <- str_replace_all(string = df.m$variable, pattern = "X", replacement = "") # remove the X before dates
avg.temp <- df.m %>% group_by(Month) %>% summarise(avg = mean(value)) # calculated the monthly mean for plotting
library(ggplot2)
ggplot(df.m, aes(x = factor(variable, levels = 1:30), y = value)) + # explicit levels keep the days in numeric, not alphabetical, order
geom_point() +
geom_point(data = avg.temp, aes(x = 15, y = avg), size = 7, color = "red") +
facet_wrap(~Month) +
theme_bw() +
labs(x = "Days of the Month", y = "Temperature (F)", title = "Distribution of Temperatures -- Monthly Mean in Red")
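If you would rather have all months in a single panel, as the question describes (months on the x-axis, temperature on the y-axis), something like the following might work. It is only a sketch: `temps_long` and its random values are made up for illustration, standing in for a melted data frame such as `df.m`. `geom_jitter()` nudges overlapping points sideways, and the red line joins the monthly means.

```r
library(ggplot2)

# Hypothetical long-format data: one row per (month, day) reading.
set.seed(1)  # makes the made-up temperatures reproducible
months <- c("April", "May", "June", "July")
temps_long <- data.frame(
  Month = factor(rep(months, each = 30), levels = months),  # calendar order, not alphabetical
  value = round(rnorm(120, mean = rep(c(50, 55, 57, 68), each = 30), sd = 8))
)

# Monthly means, computed once so the line and the red points share the same values.
avg <- aggregate(value ~ Month, data = temps_long, FUN = mean)

p <- ggplot(temps_long, aes(x = Month, y = value)) +
  geom_jitter(width = 0.2, height = 0, alpha = 0.6) +    # horizontal jitter only
  geom_line(data = avg, aes(group = 1), color = "red") + # group = 1 joins the means across months
  geom_point(data = avg, color = "red", size = 3) +
  labs(x = "Month", y = "Temperature (F)")
print(p)
```

Setting `height = 0` leaves the plotted temperatures exact; only the horizontal position is jittered.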
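Posted on 2015-11-24 16:17:26
A solution using ggplot2 (for plotting), tidyr (to reshape the table into a data frame that is easier to work with), and dplyr (to manipulate the data frame):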
df <- structure(list(Jan = c(50L, 60L, 65L, 40L, 50L, 60L, 65L, 40L,
50L, 60L, 65L, 40L, 50L, 60L, 65L, 40L, 50L, 60L, 65L, 40L, 50L,
60L, 65L, 40L, 50L, 60L, 65L, 40L, 50L, 50L), Feb = c(50L, 60L,
65L, 40L, 50L, 60L, 65L, 40L, 50L, 60L, 65L, 40L, 50L, 60L, 65L,
40L, 50L, 60L, 65L, 40L, 50L, 60L, 65L, 40L, 50L, 60L, 65L, 40L,
50L, 50L), Mar = c(50L, 60L, 65L, 40L, 50L, 60L, 65L, 40L, 50L,
60L, 65L, 40L, 50L, 60L, 65L, 40L, 50L, 60L, 65L, 40L, 50L, 60L,
65L, 40L, 50L, 60L, 65L, 40L, 50L, 50L), Apr = c(50L, 60L, 65L,
40L, 50L, 60L, 65L, 40L, 50L, 60L, 65L, 40L, 50L, 60L, 65L, 40L,
50L, 60L, 65L, 40L, 50L, 60L, 65L, 40L, 50L, 60L, 65L, 40L, 50L,
50L), May = c(50L, 60L, 65L, 40L, 50L, 60L, 65L, 40L, 50L, 60L,
65L, 40L, 50L, 60L, 65L, 40L, 50L, 60L, 65L, 40L, 50L, 60L, 65L,
40L, 50L, 60L, 65L, 40L, 50L, 50L), Jun = c(55L, 66L, 71L, 44L,
55L, 66L, 71L, 44L, 55L, 66L, 71L, 44L, 55L, 66L, 71L, 44L, 55L,
66L, 71L, 44L, 55L, 66L, 71L, 44L, 55L, 66L, 71L, 44L, 55L, 55L
), Jul = c(57L, 68L, 74L, 45L, 57L, 68L, 74L, 45L, 57L, 68L,
74L, 45L, 57L, 68L, 74L, 45L, 57L, 68L, 74L, 45L, 57L, 68L, 74L,
45L, 57L, 68L, 74L, 45L, 57L, 57L), Aug = c(68L, 81L, 88L, 54L,
68L, 81L, 88L, 54L, 68L, 81L, 88L, 54L, 68L, 81L, 88L, 54L, 68L,
81L, 88L, 54L, 68L, 81L, 88L, 54L, 68L, 81L, 88L, 54L, 68L, 68L
)), .Names = c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul",
"Aug"), class = "data.frame", row.names = c(NA, -30L))
library(ggplot2)
library(tidyr)
library(dplyr)
df.temps <- df %>% select(Mar:Jun) %>% gather(month, temperature, factor_key = TRUE) # factor_key keeps the months in column order instead of alphabetical order
df.avg <- df.temps %>% group_by(month) %>% summarise(average=mean(temperature))
ggplot() +
geom_point(data=df.temps, aes(x=temperature, y=month), position=position_jitter(width=1, height=0)) +
geom_point(data=df.avg, aes(x=average, y=month), color="red", size=3) +
geom_line(data=df.avg, aes(x=average, y=month, group=NA)) +
labs(x = "Temperature (in F)", y = "Month")
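The same idea can also be sketched in base graphics, which is closer to the original `boxplot()` attempt and needs no extra packages. The `temps` data frame below is invented for illustration (one column per month, one row per day); `stripchart()` with `method = "jitter"` spreads identical values apart, and `lines()` joins the column means.

```r
# Hypothetical wide data: one column per month, 30 daily readings each.
temps <- data.frame(
  Mar = rep(c(50, 60, 65, 40), length.out = 30),
  Apr = rep(c(50, 60, 65, 40), length.out = 30),
  May = rep(c(50, 60, 65, 40), length.out = 30),
  Jun = rep(c(55, 66, 71, 44), length.out = 30)
)

# One mean per month, in column order.
month_means <- colMeans(temps)

# A data frame is a list of columns, so stripchart() draws one strip per month.
# With vertical = TRUE the jitter is applied sideways, so temperatures stay exact.
stripchart(temps, method = "jitter", jitter = 0.15, vertical = TRUE,
           pch = 16, xlab = "Month", ylab = "Temperature (F)")

# Strips sit at x = 1, 2, ..., so the means can be overlaid at those positions.
lines(seq_along(month_means), month_means, type = "b",
      col = "red", pch = 18, cex = 1.5)
```

Because the strips are drawn at integer positions, the mean line lines up with the months without any further conversion.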
https://stackoverflow.com/questions/33882551