Recommand · May 14, 2022 0


R Dataframe Filter Values

I have a dataframe looks like below:

Place   Time1   Time2    Time3    Time4    Time5    Time6    Time7   Time8   Time9 ...
CA      0.2     0.3      0.1      0.4       
MN      0.01    0.19     0.1      0.2      0.1      0.1      0.1     0.1     0.1
NY      0.15    0.05     0.2      0.1      0.1      0.2      0.2

There are many more time columns. For each row, the values of the time columns sum to 1. I want to rank the time values for each row from large to small and get the time points when the sum of these time points first pass 0.25. For example, below is a sample output (NY has three values because none of them passes 0.25 but they are tie):

CA      Time 4    
MN      Time 4 Time 2
NY      Time 3 Time 6 Time 7

You may get the data in long format, arrange the rows by Place and decreasing value. For each Place select the top values who sum up to 0.25. In case of tie, this would select all the similar values.

library(dplyr)
library(tidyr)

df %>%
  pivot_longer(cols = -Place, 
               values_drop_na = TRUE) %>%
  arrange(Place, desc(value)) %>%
  group_by(Place) %>%
  summarise(result = toString(name[value %in% 
                      value[seq_len(match(TRUE, cumsum(value) > 0.25))]]))

#  Place result             
#  <chr> <chr>              
#1 CA    Time4              
#2 MN    Time4, Time2       
#3 NY    Time3, Time6, Time7

data

It is easier to help if you provide data in a reproducible format

df <- structure(list(Place = c("CA", "MN", "NY"), Time1 = c(0.2, 0.01, 
0.15), Time2 = c(0.3, 0.19, 0.05), Time3 = c(0.1, 0.1, 0.2), 
    Time4 = c(0.4, 0.2, 0.1), Time5 = c(NA, 0.1, 0.1), Time6 = c(NA, 
    0.1, 0.2), Time7 = c(NA, 0.1, 0.2), Time8 = c(NA, 0.1, NA
    ), Time9 = c(NA, 0.1, NA)), class = "data.frame", row.names = c(NA, -3L))



You may also like…




amazon-web-services android angular api arrays c# css dart dataframe django docker excel express firebase flutter html ios java javascript jquery json kotlin laravel linux list mongodb mysql node.js pandas php postgresql python python-3.x r react-native reactjs regex spring spring-boot sql sql-server string swift typescript vue.js