Doing regression only with correlation matrix, means, and SDs [duplicate]
























This question already has an answer here:




  • Is there a way to use the covariance matrix to find coefficients for multiple regression?

    1 answer




How is it mathematically possible to run a full regression analysis of a dependent variable (y) on 3 predictors (x1, x2, x3) knowing only the means, Ns, SDs, and correlations among these 4 variables (i.e., without the original data)?



I would highly appreciate an R demonstration.



ns    <- c(273, 273, 273, 273)
means <- c(15.4, 7.1, 3.5, 6.2)
sds   <- c(3.4, 0.9, 1.5, 1.4)

r <- matrix(c( 1.00,  0.57, -0.40,  0.48,
               0.57,  1.00, -0.61,  0.66,
              -0.40, -0.61,  1.00, -0.68,
               0.48,  0.66, -0.68,  1.00), 4)

rownames(r) <- colnames(r) <- c('y', paste0('x', 1:3))
























marked as duplicate by whuber Dec 5 at 14:34

This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.











  • You cannot run a truly "full" regression analysis with just these statistics, because you will not be able to construct residuals and perform regression diagnostics that depend on them. You are limited to making and testing parameter estimates.
    – whuber
    Dec 5 at 14:35










  • Structural equation modeling is the most obvious answer in my mind, which is more or less the math provided in the answer below. Most SEM programs will be able to accept your inputs (all need to be able to estimate a covariance matrix in the end).
    – Matt Barstead
    Dec 10 at 2:29










  • @whuber, do I need to first convert my correlation matrix into a var-covariance matrix using r_x_iy_i * sd_x_i*sd_y_i and then go from there?
    – rnorouzian
    Dec 10 at 4:52












  • That's one approach, because it reduces your problem to one with a known, explicit solution.
    – whuber
    Dec 10 at 15:13
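The conversion discussed in these comments is simply element-wise scaling: cov_ij = r_ij · sd_i · sd_j. As a minimal sketch using the question's numbers (in dependency-free Python rather than R, purely for brevity; the arithmetic is identical):

```python
# Convert the correlation matrix to a variance-covariance matrix:
# cov[i][j] = r[i][j] * sd[i] * sd[j].
# Numbers are the summary statistics from the question (order: y, x1, x2, x3).

sds = [3.4, 0.9, 1.5, 1.4]
r = [[ 1.00,  0.57, -0.40,  0.48],
     [ 0.57,  1.00, -0.61,  0.66],
     [-0.40, -0.61,  1.00, -0.68],
     [ 0.48,  0.66, -0.68,  1.00]]

cov = [[r[i][j] * sds[i] * sds[j] for j in range(4)] for i in range(4)]

# The diagonal recovers the variances, e.g. cov[0][0] equals sd_y squared.
```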















r regression multiple-regression regression-coefficients






edited Dec 10 at 4:15
asked Dec 5 at 3:49

rnorouzian
1 Answer
  1. From the correlation matrix, you can estimate the standardized regression coefficients (all except the intercept) as follows (assuming the first row/column corresponds to y):

$$\left(\begin{matrix} 1.00 & -0.61 & 0.66\\
-0.61 & 1.00 & -0.68\\
0.66 & -0.68 & 1.00\end{matrix}\right)^{-1}\left(\begin{matrix} 0.57\\
-0.40\\
0.48\end{matrix}\right)$$

  2. Combining these with the standard deviations, you can convert the standardized coefficients into raw (unstandardized) regression coefficients.

  3. Using the means, you can then estimate the intercept.
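The question asked for an R demonstration; the three steps above translate directly into any language with a linear solver. As a hedged sketch, here is the same computation in dependency-free Python, with a small Gaussian-elimination helper standing in for a matrix solve:

```python
# Sketch of the answer's three steps in plain Python (no external libraries).
# Inputs are the summary statistics from the question (order: y, x1, x2, x3).

means = [15.4, 7.1, 3.5, 6.2]
sds   = [3.4, 0.9, 1.5, 1.4]

r = [[ 1.00,  0.57, -0.40,  0.48],
     [ 0.57,  1.00, -0.61,  0.66],
     [-0.40, -0.61,  1.00, -0.68],
     [ 0.48,  0.66, -0.68,  1.00]]

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    a = [row[:] + [bi] for row, bi in zip(a, b)]   # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda i: abs(a[i][col]))
        a[col], a[piv] = a[piv], a[col]
        for i in range(n):
            if i != col:
                f = a[i][col] / a[col][col]
                a[i] = [aij - f * acj for aij, acj in zip(a[i], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]

# Step 1: standardized slopes, beta = Rxx^{-1} r_xy
rxx = [row[1:] for row in r[1:]]
rxy = [row[0] for row in r[1:]]
beta_std = solve(rxx, rxy)

# Step 2: rescale to raw slopes, b_i = beta_i * sd_y / sd_{x_i}
b = [bs * sds[0] / sd for bs, sd in zip(beta_std, sds[1:])]

# Step 3: intercept, b0 = mean_y - sum(b_i * mean_{x_i})
b0 = means[0] - sum(bi * mi for bi, mi in zip(b, means[1:]))

print("standardized slopes:", beta_std)
print("raw slopes:         ", b)
print("intercept:          ", b0)
```

Only the coefficient estimates are recoverable this way; as noted in the comments above, residual-based diagnostics require the raw data.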


























  • I do not know how to use R.
    – user158565
    Dec 5 at 5:00






  • How does collinearity influence this?
    – PascalVKooten
    Dec 5 at 9:25










  • With full collinearity, the inverse of the matrix in the answer does not exist; a generalized inverse must be used, and there are infinitely many estimates. With partial collinearity, the inverse is numerically unstable, i.e., a small change in X can produce a tremendous change in the estimates.
    – user158565
    Dec 5 at 14:20


















edited Dec 5 at 5:58
answered Dec 5 at 4:29

user158565







