In [73]:
set.seed(1113)
sample(c("Axel","Mike","Markus"))
Out[73]:
  1. 'Axel'
  2. 'Markus'
  3. 'Mike'
In [79]:
set.seed(1214)
sample(c("Axel","Mike","Markus"), size=1)
Out[79]:
'Mike'
In [3]:
library("tidyverse")
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.4.4     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
In [28]:
options(repr.plot.width=14, repr.plot.res=300)
In [9]:
cpu <- read_tsv("cpu.tsv") %>% mutate(timestamp = ymd_hms(timestamp))
Rows: 346230 Columns: 10
── Column specification ────────────────────────────────────────────────────────
Delimiter: "\t"
chr (2): hostname, timestamp
dbl (8): interval, CPU, %user, %nice, %system, %iowait, %steal, %idle

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
In [10]:
head(cpu)
Out[10]:
A tibble: 6 × 10
hostname  interval  timestamp            CPU    %user  %nice  %system  %iowait  %steal  %idle
<chr>     <dbl>     <dttm>               <dbl>  <dbl>  <dbl>  <dbl>    <dbl>    <dbl>   <dbl>
gaia2     60        2024-03-01 09:45:27  -1     1.05   0      2.62     0.09     0       96.23
gaia2     60        2024-03-01 09:46:27  -1     1.33   0      2.92     0.16     0       95.60
gaia2     60        2024-03-01 09:47:27  -1     1.47   0      2.98     0.16     0       95.38
gaia2     60        2024-03-01 09:48:27  -1     1.65   0      3.25     0.09     0       95.01
gaia2     60        2024-03-01 09:49:27  -1     1.05   0      2.42     0.06     0       96.47
gaia2     60        2024-03-01 09:50:27  -1     1.38   0      2.22     0.06     0       96.34
In [31]:
cpu %>%
    ggplot(aes(x=timestamp, y=`%idle`, color=hostname)) + geom_line()
Out[31]:
In [19]:
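# Keep March 2024 only. tz="UTC" makes ymd() return a POSIXct,
# so the cut-off has the same class as the ymd_hms() timestamps.
cpu2 <- cpu %>%
    filter(timestamp < lubridate::ymd("2024-03-31", tz="UTC"))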
In [30]:
cpu2 %>%
    ggplot(aes(x=timestamp, y=`%idle`, color=hostname)) + geom_line()
Out[30]:
In [21]:
summary(cpu2$timestamp)
Out[21]:
                      Min.                    1st Qu. 
"2024-03-01 09:45:25.0000" "2024-03-08 16:06:26.0000" 
                    Median                       Mean 
"2024-03-15 22:27:27.0000" "2024-03-15 16:42:06.9877" 
                   3rd Qu.                       Max. 
"2024-03-22 18:14:46.0000" "2024-03-29 10:20:05.0000" 
In [41]:
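# Daily minimum of %idle per host. cpu2 only covers March 2024,
# so the day of month identifies each date unambiguously.
cpu3 <- cpu2 %>%
    group_by(hostname, lubridate::day(timestamp)) %>%
    summarize(idle=min(`%idle`))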
`summarise()` has grouped output by 'hostname'. You can override using the
`.groups` argument.
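The message above is informational: after `summarize()`, the result is still grouped by `hostname`. A minimal sketch of naming the grouping column and dropping the grouping explicitly, using a made-up toy tibble (the values and the column name `day` are illustrative assumptions, not taken from `cpu.tsv`):

```r
library(dplyr)
library(lubridate)

# Toy data standing in for cpu2; values are invented for illustration.
toy <- tibble(
    hostname  = c("a", "a", "a", "b"),
    timestamp = ymd_hms(c("2024-03-01 09:00:00", "2024-03-01 10:00:00",
                          "2024-03-02 09:00:00", "2024-03-01 09:00:00")),
    `%idle`   = c(96.2, 80.5, 99.1, 70.0)
)

toy_daily <- toy %>%
    group_by(hostname, day = day(timestamp)) %>%       # named grouping column
    summarize(idle = min(`%idle`), .groups = "drop")   # ungrouped result, no message

toy_daily
```

With a named grouping column, the summary table would carry a `day` header rather than `lubridate::day(timestamp)`.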
In [47]:
cpu3 %>%
    ggplot(aes(x=idle, fill=hostname)) +
    geom_histogram()
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Out[47]:
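The `stat_bin()` message above asks for an explicit bin width. A minimal sketch of one possible choice on made-up toy data (the values, and the choice of 5-point bins, are assumptions for illustration only):

```r
library(ggplot2)

# Toy data standing in for cpu3; values are invented for illustration.
toy <- data.frame(
    hostname = rep(c("a", "b"), each = 5),
    idle     = c(12, 48, 77, 83, 96, 5, 66, 71, 88, 99)
)

# %idle is a percentage, so 5-point bins aligned at 0 are easy to read.
p <- ggplot(toy, aes(x = idle, fill = hostname)) +
    geom_histogram(binwidth = 5, boundary = 0)

p
```

Setting `binwidth` silences the message; `boundary = 0` aligns the bin edges with multiples of 5.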
In [49]:
cpu3 %>%
    filter(idle>75 & idle<90)
Out[49]:
A grouped_df: 40 × 3
hostname  lubridate::day(timestamp)  idle
<chr>     <int>                      <dbl>
gaia2      5                         75.80
gaia2     11                         89.48
gaia2     25                         87.04
gaia3     12                         89.67
gaia3     28                         87.86
gaia4      2                         82.10
gaia4      3                         81.42
gaia4      7                         88.19
gaia4      8                         78.73
gaia4     12                         76.43
gaia4     13                         75.38
gaia4     14                         76.56
gaia4     15                         75.30
gaia4     16                         75.92
gaia4     25                         75.66
gaia5      1                         83.84
gaia5      2                         82.90
gaia5      3                         82.16
gaia5      6                         81.99
gaia5      7                         80.09
gaia5      9                         78.55
gaia5     10                         79.08
gaia5     11                         78.16
gaia5     13                         87.42
gaia5     14                         80.74
gaia5     15                         82.00
gaia5     16                         75.49
gaia5     17                         86.24
gaia5     18                         83.54
gaia5     19                         86.82
gaia5     20                         77.34
gaia5     21                         89.68
gaia5     22                         87.67
gaia5     25                         89.70
gaia5     26                         89.16
gaia5     27                         86.28
gaia5     28                         89.08
jupiter4   3                         83.03
jupiter4  27                         88.34
jupiter5   3                         86.39
In [51]:
library(lubridate)
In [57]:
cpu2 %>%
    mutate(d=day(timestamp)) %>%
    filter(hostname=="gaia5", d==5) %>%
    ggplot(aes(x=timestamp, y=`%idle`)) + geom_line()
Out[57]:
In [61]:
cpu2 %>%
    mutate(d=day(timestamp)) %>%
    filter(hostname=="gaia5") %>%
    ggplot(aes(x=timestamp, y=`%idle`)) +
    geom_line() +
    facet_wrap(~d, scales="free_x")
Out[61]:
In [66]:
cpu2 %>%
    mutate(d=day(timestamp)) %>%
    filter(hostname=="jupiter") %>%
    ggplot(aes(x=timestamp, y=`%idle`)) +
    geom_line() +
    facet_wrap(~d, scales="free_x")
Out[66]:
In [67]:
mem <- read_tsv("mem.tsv") %>% mutate(timestamp = ymd_hms(timestamp))
Rows: 346229 Columns: 14
── Column specification ────────────────────────────────────────────────────────
Delimiter: "\t"
chr  (2): hostname, timestamp
dbl (12): interval, kbmemfree, kbavail, kbmemused, %memused, kbbuffers, kbca...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
In [68]:
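# Same March 2024 window as cpu2, restricted to one host.
# tz="UTC" makes ymd() return a POSIXct, matching the ymd_hms() timestamps.
mem2 <- mem %>%
    filter(timestamp < lubridate::ymd("2024-03-31", tz="UTC")) %>%
    filter(hostname=="jupiter")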
In [71]:
mem2 %>%
    mutate(d=day(timestamp)) %>%
    ggplot(aes(x=timestamp, y=`%memused`)) +
    geom_line() +
    facet_wrap(~d, scales="free_x")
Out[71]: