Doing regression only with correlation matrix, means, and SDs [duplicate]
This question already has an answer here:
Is there a way to use the covariance matrix to find coefficients for multiple regression? (1 answer)
I was wondering how, mathematically, it is possible to run a full regression analysis of a dependent variable (y) on 3 predictors (x1, x2, x3) knowing only the means, Ns, SDs, and the correlations among these 4 variables (without the original data)?
I would highly appreciate an R demonstration.
ns    <- c(273, 273, 273, 273)   # sample sizes
means <- c(15.4, 7.1, 3.5, 6.2)  # means of y, x1, x2, x3
sds   <- c(3.4, 0.9, 1.5, 1.4)   # standard deviations of y, x1, x2, x3
r <- matrix(c(                   # correlation matrix, y first
   1.0,  .57, -.40,  .48,
   .57,  1.0, -.61,  .66,
  -.40, -.61,  1.0,  -.68,
   .48,  .66, -.68,  1.0), 4)
rownames(r) <- colnames(r) <- c('y', paste0('x', 1:3))
Tags: r, regression, multiple-regression, regression-coefficients
marked as duplicate by whuber♦ Dec 5 at 14:34
This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.
You cannot run a truly "full" regression analysis with just these statistics, because you will not be able to construct residuals and perform regression diagnostics that depend on them. You are limited to making and testing parameter estimates.
– whuber♦
Dec 5 at 14:35
Structural equation modeling is the most obvious answer in my mind, and it is more or less the math given in the answer below. Most SEM programs will accept your inputs (all of them need to estimate a covariance matrix in the end).
– Matt Barstead
Dec 10 at 2:29
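(Added for illustration, not part of the comment:) a minimal lavaan sketch of the SEM route, assuming the lavaan package is installed and the r, sds, means, and ns objects from the question are in the workspace. It fits the regression directly from the covariance matrix, means, and N, so no raw data are needed; the coefficients match the moment-based solution, while the standard errors are the ML versions lavaan reports by default.
library(lavaan)                          # assumes lavaan is installed
V <- outer(sds, sds) * r                 # convert correlations to covariances
dimnames(V) <- dimnames(r)
names(means) <- rownames(r)              # lavaan matches variables by name
fit <- sem("y ~ x1 + x2 + x3",
           sample.cov  = V,
           sample.mean = means,
           sample.nobs = ns[1],
           meanstructure = TRUE)
coef(fit)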
@whuber, do I need to first convert my correlation matrix into a variance-covariance matrix using r_ij * sd_i * sd_j, and then go from there?
– rnorouzian
Dec 10 at 4:52
That's one approach, because it reduces your problem to one with a known, explicit solution.
– whuber♦
Dec 10 at 15:13
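To make that route concrete (my own sketch, not whuber's code): convert the correlations to a covariance matrix, generate a data set whose sample moments match it exactly via MASS::mvrnorm(..., empirical = TRUE), and run lm(). The coefficient table, R-squared, and F-test then agree with what the summary statistics imply; the residuals, however, are artificial, so diagnostics based on them are meaningless.
library(MASS)                              # for mvrnorm()
V   <- outer(sds, sds) * r                 # cov(i, j) = r_ij * sd_i * sd_j
dat <- as.data.frame(mvrnorm(n = ns[1], mu = means, Sigma = V, empirical = TRUE))
names(dat) <- rownames(r)                  # 'y', 'x1', 'x2', 'x3'
summary(lm(y ~ x1 + x2 + x3, data = dat))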
1 Answer
Based on the correlation matrix, you can estimate the standardized regression coefficients (all except the intercept) as follows (taking the first row/column to be y):
$$\left(\begin{matrix} 1.00 & -0.61 & 0.66\\
-0.61 & 1.00 & -0.68\\
0.66 & -0.68 & 1.00\end{matrix}\right)^{-1}
\left(\begin{matrix} 0.57\\
-0.40\\
0.48\end{matrix}\right)$$
Combining these with the standard deviations, you can convert the standardized coefficients into raw (unstandardized) coefficients, $b_j = \beta_j \, sd_y / sd_{x_j}$.
Using the means, you can then recover the intercept, $b_0 = \bar y - \sum_j b_j \bar x_j$.
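Since the answerer notes below that they do not use R, here is a minimal R sketch of that algebra (my addition, not part of the original answer); it assumes the r, sds, and means objects defined in the question are in the workspace.
b_std <- solve(r[-1, -1], r[-1, 1])      # standardized slopes: Rxx^{-1} r_xy
b     <- b_std * sds[1] / sds[-1]        # raw slopes: beta_j * sd_y / sd_xj
b0    <- means[1] - sum(b * means[-1])   # intercept: ybar - sum(b_j * xbar_j)
c(intercept = b0, b)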
I do not know how to use R.
– user158565
Dec 5 at 5:00
How does collinearity influence this?
– PascalVKooten
Dec 5 at 9:25
Full collinearity means the matrix in the answer has no inverse; a generalized inverse must be used instead, and there are infinitely many estimates. Partial (near) collinearity means the inverse is numerically unstable, i.e., a small change in X can produce a huge change in the estimates.
– user158565
Dec 5 at 14:20
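A tiny illustration of that point (my addition, not part of the comment): with two perfectly collinear predictors the correlation matrix is exactly singular, solve() throws an error, and MASS::ginv() returns the minimum-norm member of the infinitely many least-squares solutions.
library(MASS)                        # for ginv()
Rxx <- matrix(c(1, 1, 1, 1), 2)      # two perfectly correlated predictors
# solve(Rxx, c(.5, .5))              # fails: system is exactly singular
drop(ginv(Rxx) %*% c(.5, .5))        # minimum-norm solution: 0.25 0.25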