Saturday, April 07, 2007

random effects

Dear Tom,
my apologies for the long delay between you request and this answer.
First of all, why not include precipitation ("Precip") as a main effect too?

Using libray "nlme" and lme():
Random intercepts per station:
random= ~1|station
Don't forget the vertical bar!
Including random slope:
random= ~Precip|station
random= ~1+Precip|station

Make sure that all stations have different names, otherwise you run the risk that stations with identical names at different distances will be considered as a single station.

You could also consider to use the updated library now called "lme4".
Then the random effects are simply included as extra terms in the model statement.

Cheers, Tom

Wednesday, February 21, 2007

random components of lme-model

Hi everyone,

I'm currently analysing a large dataset with measurements of nutrients in the Dutch marine systems in R. I have a variable "OffshoreDist" representing the distance from the sampling location to the coast. I also calculated the total amount of precipitation over The Netherlands and called this variable "Precip".What I want to do is link mean annual dissolved inorganic nitrogen (DIN) to the amount of precipitation. I expect that this relationship is dependent on the offshore distance. Distance classes can contain 1 or more sampling stations.

That's why I propose these two models with different random components:
model1 <- lme(log(DIN) ~ Precip:factor(OffshoreDist) + factor(OffshoreDist),
random= ~1station)
model2 <- lme(log(DIN) ~ Precip:factor(OffshoreDist) + factor(OffshoreDist),
random= ~Precipstation)

Are these the right models for random intercepts per station (model1) and random slope+intercept per station (model2)? Or should this be something like:

random = ~ factor(OffshoreDist) -1 station (model 1)
random = ~ Precip:factor(OffshoreDist)station (model 2)

I've read Pinheiro & Bates, 2000 which is a good reference, but explanation of the formulae only by means of examples does not seem to work for me.

Thanks in advance.


Tom Van Engeland

Wednesday, February 07, 2007

moved to new blogger

I've moved the blog into the new google template. Everyone logging in will be invited to move her/his account too. A nice aspect of the change is that the blog can now be searched easily!
Cheers, Tom

Tuesday, January 23, 2007

F lmer

Hi Marie,
do you really need to use the quasibinomial distribution? Is there evidence of overdispersion in the maximal model you fit?

If not, then you can use the binomial distribution.
That allows you to use likelihood-ratio tests for the fixed and random effects.

If there is clear overdispersion, then I guess the best thing to do would be resampling according to the model, if that works for the quasibinomial "distribution". You need to use mcmcsamp() for that.
Otherwise, it will become hand-work to get some of those hotly debated F-ratios out....
Cheers, Tom

lmer P-value

hi everybody
Do you know the simplest and correct way to produce P-values of lmer and of the ANOVA (F test) on the lmer model in R?
My lmer model is of this form:
I read an extensive eplanation by Douglas Bates on R help archive named "[R] lmer, p-values and all that". However, nothing is mentionned regarding Laplace. So i do not know what to do.
Could someone give me advice?
many thanks

Tuesday, January 16, 2007

Longitudinal Data

Finally, the participants in the course Longitudinal and Incomplete Data have been invited. Transcripts for R of the SAS code presented during the course will be added to the weblog soon.
Cheers, Tom

Loglinear models

Hi Nicola,
the best manual is, based on the book by Agresti.
You can basically use glm with a poisson link for log-linear models, I guess it is also discussed in the Venables and Ripley book.
Good luck, Tom

Loglin models and contingency tables

Hi Tom,

I'm trying to use loglinear models to investigate contingency tables with frequencies, but I can't figure out how to get the standardized residuals, odds ratios etc. Also the help function doesn't mention them, and the R-manuals I have don't even mention loglin models at all.
Do I have to calculate them by hand or is there perhaps a good manual on loglin models?
Thanks in advance!

ps I have a 3-way table with one binary dependent factor and 2 explanatory factors with 3 levels each.

Friday, January 05, 2007

lmer, random effect with four levels

Dear Amir,
the "place" random effect in your dataset has only four levels, and they are nested within plants. The error which is returned states that the variance covariance matrix is singular, meaning that variation in intercept, slope and their covariance cannot be fitted. If you fit the random effect as (1|place), you will read that the random effect variance is effectively zero.
It does not make sense to include place both as fixed and as random, as in you second to fourth models. Since the number of levels of the random effects is very low, you could even just model them as fixed, using glm. The idea of random effects is that they are a representative sample from a large population, which is hard to defend with just two levels sampled per group!
More importantly, the distribution of your data does not look poisson at al. There are many zero observations. I would convert the counts to presence/absence data and analyse those as bernoulli probabilities. That works to some extent:


I've added you models to your post, so other visitors to this blog can also read and respond to your question,
all the best, Tom

Thursday, January 04, 2007

Mixed effects model

Beste Tom,
Tijdens de laatste GLM cursus hebben we samen een model gemaakt voor de
analyse van een dataset. Ik probeer het model nu te gebruiken, maar krijg
steeds fout meldingen in R. Helaas heb ik de R file van de analyse die we
samen gedaan hebben op een verkeerde wijze opgeslagen, waardoor ik niet
meer terug kan halen wat ik fout doe. Kan je misschien kijken wat er
mis gaat?

Ik stuur de dataset en de Tin R file met het model die we gebruikt hebben
als attachment naar je e-mail account. De dataset gaat over voorkomen van insecten in 4 plantages, twee van ieder plant soort (Guava en Eucalyptus), gedurende 2
jaar. De data is verzameld door bomen in de plantages te bemonsteren en
daarom nemen we de zoektijd per boom als weegfactor voor de datapunt. Ik
probeer erachter te komen of de plantsoort effect heeft op het voorkomen
van de verschillende insecten groepen.

Alvast bedankt,

Here's Amir's code:

setwd("D:/Amir-18-9-06/field-data/All data/analysis")