Rank Transformation of an Array
Is there a built-in function that rank-transforms an array of data? By rank transformation I mean
data = {2.4,5,1,6,7,10,2}
Rank[data]={3,4,1,5,6,7,2}
where each value in data is assigned a rank from minimum to maximum: the lowest value in data is assigned rank 1, the next-higher value is assigned rank 2, etc.
Ordering does not accomplish this, as we obtain
Ordering[data]
{3,7,1,2,4,5,6}
Edit 1: As Carl pointed out, I need to express what I want to happen in the case of a tied ranking. Ultimately, I want to use this rank transformation in the context of the definition of Spearman's Rho function where
Covariance[Transpose[{Rank[X],Rank[Y]}]][[1,2]]/(
StandardDeviation[Rank[X]]*StandardDeviation[Rank[Y]])
should equal
SpearmanRho[Transpose[{X,Y}]][[1,2]]
where X and Y are arrays of data of equal length.
asked Nov 26 at 18:37 by tquarton (edited Nov 26 at 19:43)
What do you want to return when there are ties?
– Carl Woll
Nov 26 at 19:20
Ah, great question. Give me a moment to respond in this comment with an edit.
– tquarton
Nov 26 at 19:30
I've actually edited the question to address your point Carl.
– tquarton
Nov 26 at 19:38
closely related / possible duplicate: How to get the ranked order
– kglr
Nov 26 at 22:40
3 Answers
What about this?
Ordering[Ordering[data]]
{3, 4, 1, 5, 6, 7, 2}
Since Ordering is the bottleneck, here is a variant that needs only one call to Ordering:
Ranking[data_] := Module[{a},
a = Range[Length[data]];
a[[Ordering[data]]] = a;
a
]
Comparison:
data = RandomReal[{-1, 1}, 1000000];
a = Ranking[data]; // RepeatedTiming // First
b = Ordering[Ordering[data]]; // RepeatedTiming // First
a == b
0.13
0.234
True
Accepted answer by Henrik Schumacher, answered Nov 26 at 19:09 (edited Nov 26 at 19:18)
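To see why a single call to Ordering is enough, it may help to walk through the question's small example by hand. The following is only an illustrative sketch: the names p and r are chosen here, and the explicit Do loop spells out what the part assignment a[[Ordering[data]]] = a does in one step.

```mathematica
data = {2.4, 5, 1, 6, 7, 10, 2};

(* p[[k]] is the position in data of the k-th smallest element *)
p = Ordering[data]
(* {3, 7, 1, 2, 4, 5, 6} *)

(* The element at position p[[k]] has rank k, so write k into slot p[[k]] *)
r = ConstantArray[0, Length[data]];
Do[r[[p[[k]]]] = k, {k, Length[data]}];
r
(* {3, 4, 1, 5, 6, 7, 2} *)

(* Applying Ordering a second time inverts the permutation p as well *)
r == Ordering[p]
(* True *)
```

In other words, the rank vector is the inverse of the permutation returned by Ordering. The second Ordering call finds that inverse by sorting again, while the part assignment writes it out directly.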
Brilliant! This does it. Thanks very much.
– tquarton
Nov 26 at 19:13
You're welcome.
– Henrik Schumacher
Nov 26 at 19:14
Carl Woll brought up a great point with regards to tied rankings. I thought I'd ping you in this comment in case you wanted to address it. For my purposes, I don't believe there are ties in my dataset of interest, so your solution still holds.
– tquarton
Nov 26 at 19:37
I'll answer my own question with a constructed function which does the job:
Rank[x_]:=Flatten[Table[Position[Sort[x], x[[i]]], {i, 1, Length[x]}]]
Ordering gives a sort of inverse of the above function: it returns, for each element of the sorted data, its position in the unsorted data. Here, the Rank function returns, for each element of the unsorted data, its position in the sorted data.
answered Nov 26 at 18:58 by tquarton (edited Nov 26 at 19:07)
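A short sketch of how the Position-based definition behaves may be useful here. The name rankByPosition is hypothetical and only stands in for the Rank definition above. Note that the list is re-sorted and re-scanned for every element, so this approach is likely to scale poorly on long lists, and that tied values break the one-rank-per-element shape of the result.

```mathematica
(* rankByPosition: hypothetical name for the Position-based definition above *)
rankByPosition[x_] := Flatten[Table[Position[Sort[x], x[[i]]], {i, 1, Length[x]}]]

(* Without ties, each element is found exactly once in the sorted list *)
rankByPosition[{2.4, 5, 1, 6, 7, 10, 2}]
(* {3, 4, 1, 5, 6, 7, 2} *)

(* With ties, each tied element reports every tied position,
   so the result is longer than the input *)
rankByPosition[{1, 2, 2}]
(* {1, 2, 3, 2, 3} *)
```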
Statistics`Library`GetDataRankings[{2.4, 5, 1, 6, 7, 10, 2}]
{3, 4, 1, 5, 6, 7, 2}
This gives the same result as Ordering@Ordering@#& if there are no ties in the input data.
If the input data has ties:
Statistics`Library`GetDataRankings[{1, 2, 2, 2, 2, 3, 3, 3, 4, 5}]
{1, 7/2, 7/2, 7/2, 7/2, 7, 7, 7, 9, 10}
It is faster than Ordering@Ordering@#& but slower than Henrik Schumacher's Ranking:
SeedRandom[1]
data = RandomReal[{-1, 1}, 1000000];
a = Ranking[data]; // RepeatedTiming // First
0.18
b = Ordering[Ordering[data]]; // RepeatedTiming // First
0.307
c = Statistics`Library`GetDataRankings[data]; // RepeatedTiming // First
0.226
a == b == c
True
A slightly faster alternative (still slower than Ranking):
ranks = Module[{r = Range@Length@#, o = Ordering@#}, Permute[r, o]] &;
d = ranks @ data; // RepeatedTiming // First
0.203
a == b == c == d
True
answered Nov 27 at 0:24 by kglr (edited Nov 27 at 0:31)
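For readers who prefer not to rely on an undocumented internal function, tie-averaged ranks can be sketched with documented functions only. The helper averageRanks below is hypothetical, not a built-in. It assumes tied entries are identical expressions (PositionIndex treats 2 and 2. as different values). The final comparison also rests on the assumption that SpearmanRho averages the ranks of tied values, which is not confirmed in this thread.

```mathematica
(* averageRanks: hypothetical helper, not a built-in function *)
averageRanks[x_List] := Module[{r = ConstantArray[0, Length[x]]},
  (* ordinal ranks: the k-th smallest element gets rank k *)
  r[[Ordering[x]]] = Range[Length[x]];
  (* each group of equal values shares the mean of its ordinal ranks *)
  Scan[(r[[#]] = Mean[r[[#]]]) &, Values[PositionIndex[x]]];
  r
]

averageRanks[{1, 2, 2, 2, 2, 3, 3, 3, 4, 5}]
(* {1, 7/2, 7/2, 7/2, 7/2, 7, 7, 7, 9, 10} *)

(* Rank-based Pearson correlation next to the built-in SpearmanRho.
   The two numbers should agree if SpearmanRho also averages tied ranks
   (an assumption here); y is made-up illustration data. *)
x = {1, 2, 2, 2, 2, 3, 3, 3, 4, 5};
y = {2, 1, 4, 3, 3, 5, 7, 6, 6, 9};
N[{Correlation[averageRanks[x], averageRanks[y]], SpearmanRho[x, y]}]
```

When the data has no ties, averageRanks reduces to the plain ordinal ranks, so it returns the same list as Ordering[Ordering[x]].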