How to aggregate categorical data in R?

I have a data frame with two columns of categorical variables (Better, Similar, Worse). I would like to build a table that counts how many times each category appears in each column.
The data frame I am using is as follows:

       Category.x  Category.y
1      Better      Better
2      Better      Better
3      Similar     Similar
4      Worse       Similar

I would like to come up with a table like this:

           Category.x    Category.y
Better     2             2
Similar    1             2
Worse      1             0

How would you go about it?

asked 2 hours ago

Daniel


Looks like you need table(df1)

– akrun
2 hours ago

Is it possible to reformat the table, so that I get it as a 3x2 table instead of a 3x3?

– Daniel
2 hours ago

I would convert to factor with common levels lvls <- unique(unlist(df1)); df1 <- lapply(df1, factor, levels = lvls) and then do the table(df1)

– akrun
1 hour ago
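The factor suggestion in the comments can be sketched as follows. Note that `table(df1)` cross-tabulates the first column against the second, so it does not give one count column per variable. One way to get the per-column layout after the factor conversion is to call `table()` on each column separately. The sample data and the `sapply(..., table)` step are assumptions added here for illustration, not part of the original comment.

```r
# Sample data rebuilt from the question; stringsAsFactors = FALSE keeps
# the columns as character regardless of R version.
df1 <- data.frame(
  Category.x = c("Better", "Better", "Similar", "Worse"),
  Category.y = c("Better", "Better", "Similar", "Similar"),
  stringsAsFactors = FALSE
)

# Common levels across all columns, as suggested in the comment.
lvls <- unique(unlist(df1))

# Convert every column to a factor sharing those levels.
df1_f <- lapply(df1, factor, levels = lvls)

# Tabulate each column on its own; because the levels are shared,
# every table has the same length and sapply() binds them into a matrix
# with one row per level and one column per variable.
counts <- sapply(df1_f, table)
print(counts)
```

Because every column shares the same levels, a category that never occurs in a column (here "Worse" in Category.y) still gets a row with a count of 0.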

Tags: r, aggregate


3 Answers

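As mentioned in the comments, table() is the standard tool for this. Three base R ways to get one count column per variable:

# 1. Stack the columns into long form (values, ind), then tabulate
table(stack(DT))

         ind
values    Category.x Category.y
  Better           2          2
  Similar          1          2
  Worse            1          0

# 2. Flatten the values and pair each one with its column name
table(value = unlist(DT), cat = names(DT)[col(DT)])

         cat
value     Category.x Category.y
  Better           2          2
  Similar          1          2
  Worse            1          0

# 3. Reshape to long form; the names split at "." into Category and time
with(reshape(DT, direction = "long", varying = 1:2),
  table(value = Category, cat = time)
)

         cat
value     x y
  Better  2 2
  Similar 1 2
  Worse   1 0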
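To make the first variant less opaque, here is a minimal sketch of what `stack()` produces before `table()` is applied, plus one way to turn the result into a plain data frame. The sample data, the variable names `long`, `tab` and `counts_df`, and the `as.data.frame.matrix()` step are assumptions added for illustration. As far as I know, `stack()` only keeps plain vector columns, so factor columns would likely need `as.character()` first; the example sidesteps this with `stringsAsFactors = FALSE`.

```r
# Sample data rebuilt from the question (character columns).
DT <- data.frame(
  Category.x = c("Better", "Better", "Similar", "Worse"),
  Category.y = c("Better", "Better", "Similar", "Similar"),
  stringsAsFactors = FALSE
)

# stack() returns a long data frame with two columns:
#   values: the category, ind: the name of the source column.
long <- stack(DT)
print(long)

# Tabulating the two columns of `long` gives category x source column.
tab <- table(long)

# Optional: convert the table to a data frame that keeps the wide layout
# (as.data.frame() alone would return the long form with a Freq column).
counts_df <- as.data.frame.matrix(tab)
print(counts_df)
```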

answered 1 hour ago

Frank


sapply(df1, function(x) sapply(unique(unlist(df1)), function(y) sum(y == x)))
#        Category.x Category.y
#Better           2          2
#Similar          1          2
#Worse            1          0
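A hedged variation on the nested `sapply()` idea: `sum(y == x)` returns `NA` as soon as a column contains a missing value, so the sketch below adds `na.rm = TRUE` and uses `vapply()` to fix the result type. The `NA` in the sample data, the explicit `lvls` vector, and the name `counts` are assumptions made for illustration and are not in the original answer.

```r
# Sample data based on the question, with one value replaced by NA
# (hypothetical) to show the effect of na.rm = TRUE.
df1 <- data.frame(
  Category.x = c("Better", "Better", "Similar", "Worse"),
  Category.y = c("Better", NA, "Similar", "Similar"),
  stringsAsFactors = FALSE
)

# Fixing the categories explicitly controls the row order of the result.
lvls <- c("Better", "Similar", "Worse")

# Outer sapply(): one column of counts per variable.
# Inner vapply(): for each category, count the matching entries,
# ignoring NA; integer(1) asserts that each count is a single integer.
counts <- sapply(df1, function(x) {
  vapply(lvls, function(y) sum(x == y, na.rm = TRUE), integer(1))
})
print(counts)
```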

answered 2 hours ago

d.b


One dplyr and tidyr possibility could be:

library(dplyr)
library(tidyr)

df %>%
 gather(var, val) %>%
 count(var, val) %>%
 spread(var, n, fill = 0)

  val     Category.x Category.y
  <chr>        <dbl>      <dbl>
1 Better           2          2
2 Similar          1          2
3 Worse            1          0

First, gather() reshapes the data from wide to long format: column "var" holds the variable names and column "val" the corresponding values. Second, count() tallies each combination of "var" and "val". Finally, spread() turns the counts back into wide format, one column per variable, filling missing combinations with 0.
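In current tidyr releases, `gather()` and `spread()` are marked as superseded in favour of `pivot_longer()` and `pivot_wider()`. The same three steps can be sketched with the newer functions as below. This is an illustrative equivalent, not part of the original answer; the sample data and the name `counts` are assumptions, and it presumes tidyr 1.0.0 or later.

```r
library(dplyr)
library(tidyr)

# Sample data rebuilt from the question.
df <- data.frame(
  Category.x = c("Better", "Better", "Similar", "Worse"),
  Category.y = c("Better", "Better", "Similar", "Similar"),
  stringsAsFactors = FALSE
)

counts <- df %>%
  # Wide to long: "var" holds the column name, "val" the category.
  pivot_longer(everything(), names_to = "var", values_to = "val") %>%
  # Count each (var, val) combination; the count lands in column n.
  count(var, val) %>%
  # Long to wide: one column per variable, 0 where a combination is absent.
  pivot_wider(names_from = var, values_from = n,
              values_fill = list(n = 0L))

print(counts)
```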

Or with dplyr and reshape2 you can do:

library(dplyr)
library(reshape2)

df %>%
 mutate(rowid = row_number()) %>%
 melt(., id.vars = "rowid") %>%
 count(variable, value) %>%
 dcast(value ~ variable, value.var = "n", fill = 0)

    value Category.x Category.y
1  Better          2          2
2 Similar          1          2
3   Worse          1          0

edited 37 mins ago

answered 1 hour ago

tmfmnk


Is var = Category.x and val= c('Better', 'Similar', 'Worse')?

– Daniel
1 hour ago

Please see the updated post for commentary.

– tmfmnk
1 hour ago
