[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

*To*: "N.W. Galwey" <ngalwey@cyllene.uwa.edu.au>*Subject*: Re: Further problem with prediction*From*: "B. Cullis" <cullisb@wagsun.agric.nsw.gov.au>*Date*: Fri, 30 Oct 1998 08:29:38 +1100 (EST)*cc*: asreml@ram.chiswick.anprod.csiro.au*In-Reply-To*: <199810290505.NAA13300@cyllene.uwa.edu.au>*Sender*: owner-asreml@chiswick.anprod.csiro.au

Dear Nick, We have a similar analysis in the SPLines paper we are presenting in South Africa. It may be useful to read it and look at the attached .as file that produced the analysis. On Thu, 29 Oct 1998, N.W. Galwey wrote: > I would be grateful for feedback on the ASREML command file and output > embedded below. The objective is to determine the relationship between > field establishment (plants/m2) and grain yield (t/ha) in three varieties of > lupin. > > I'd like to know whether the output indicates that the model I am fitting is > appropriate. In particular: > > i. I put two starting values after the codeword AR with the intention of > getting first- and second-lag autocorrelations. Is this the correct method? yes I believe so. > > ii. If so, does the output indicate that the inclusion of second-lag > autocorrelation gives a significantly better fit? I am assuming that it > does because the Compnt/StndErr values for both AR terms are quite large > (>3). The most reliable way to do this is to drop the ar(2) parameter and examine the change in REMLlog lik. > > iii. Why are the gamma and component values for these two terms so > similar? Is this just coincidence? I presume you mean the coeffs for the row and column terms. this is no coincidence. They are the same. These are not components, but correlations. > > iv. I don't want to fit any column effects. So is the codeword IDEN > okay? How should I decide whether the identify matrix should be scaled? > No scaling is required we are fitting a separable model to the row.column terms. This is usually the error term as it uniquely defines the plots. The model is var(vec(e)) = \sigma^2 C_c x C_r where the C_i's are the correlation matrices for cols and rows. > v. Is it okay to fit a linear effect of the variate ESTAB (and its > interaction with Variety), to account for any overall trend, before > including the spline in the model? This must be done, otherwise you dont have a natural cubic smoothing spline (NCS). The spl() term only includes the nonlinear component. You dont need c(variety) in the random part, as this is only relevant for fixed effects. > > vi. Should the spline (and its interaction with Variety) be put in the > random effects model? If so, will the variance explained by the spline be > included in the error against which the linear trend is tested? > Definitely! Yes. But my comment above stills holds. Dropping linear terms implies you no longer are fitting a NCS. > vii. Am I right in thinking that it makes no difference to the fitted value > of a data point whether a term (e.g. spl(ESTAB)) is fitted as a fixed or a > random effect? > No. it must be a random effect. I make a few more points further down in the analysis which are relevant. > The desired outcome is a curve relating yield to establishment in each > variety, and it is therefore necessary to obtain predicted values. I would > like to avoid the complexities of setting up a .pin file, and the ASREML > manual suggests adding some "missing data" and fitting them with "missing > values" (p 86, version of 2 October 1998). However, it seems to me that > this approach won't work for a spatial analysis, as each observation must be > allocated to a row and column. I had hoped that the row and column values > would not influence the fitted value (just as the variogram is not > influenced by whether AR terms are fitted), but it turns out that they do > have an effect. Any suggestions? > see further down > __________________ > > 96wh14.as > > Restricted branching lupin - density trials > Variety 3 !A > TDENS 6 > Rep 5 > PLOT 90 > COL 2 > ROW 101 > THA > ESTAB > > C:\DOCS\Dracup_M\96WH14\96WH14.dat !skip 1 !maxit 20 > > THA ~ mu mv c(Variety) ESTAB c(Variety).ESTAB, > !r Rep spl(ESTAB) c(Variety).spl(ESTAB) > 1 2 > ROW ROW AR .1 .1 > COL COL IDEN > ______________________ > 96wh14.asr > > ASREML [30 Sep 1998] Restricted branching lupin - density trials > > 10/29/98 12:09:43.22 8.00 Mbyte c:\DOCS\Dracup_M\96WH14\96wh14.as > QUALIFIERS: !skip 1 > Reading C:\DOCS\Dracup_M\96WH14\96WH14.dat FREE FORMAT skipping 1 > lines > Univariate analysis of THA > Using 108 records [of 108 read from 108 lines of > C:\DOCS\Dracup_M\96W] > Model term Size Type COL Minimum Mean Maximum #zero #miss > 1 Variety 3 Factor 1 1 2.0000 3 0 18 > 2 TDENS 6 Factor 2 25 71.6667 125 0 18 > WARNING - More levels in factor than expected > 3 Rep 5 Factor 3 1 3.0000 5 0 18 > 4 PLOT 90 Factor 4 1 45.5000 90 0 18 > 5 COL 2 Factor 5 1 1.5000 2 0 0 > 6 ROW 54 Factor 6 1 27.5000 54 0 0 > 7 THA 1 Variate 7 0.7530 1.014 1.428 0 20 > 8 ESTAB 1 Covariat 8 19.60 71.19 137.6 0 18 > 9 mu 1 Constant Term > 10 mv_estimates 20 Missing value > 11 c(Variety) 2 Factor 1 1 2.0000 3 0 18 > 12 c(Variety).E 2 Interaction 11 c(Vari: 2 8 ESTAB : 1 > 13 spl(ESTAB) 47 Spline 8 19.60 71.19 137.6 0 18 > 14 c(Variety).s 94 Interaction 11 c(Vari: 2 13 spl(ESTAB) : 47 > 54 AR=AutoR 0.10 0.10 > 2 identity > Forming 173 equations: 27 dense > Initial updates will be shrunk by factor 0.548 > LogL= 80.6926 S2= 0.11703E-01 82 df 0.10000 0.10000 0.10000 > 1.0000 0.10000 0.10000 > LogL= 104.813 S2= 0.11911E-01 82 df 0.16662 0.10000E-010.10000E-01 > 1.0000 0.21445 0.27837 > LogL= 116.546 S2= 0.13897E-01 82 df 0.30356 0.10000E-020.10000E-02 > 1.0000 0.30181 0.35295 > LogL= 121.136 S2= 0.15483E-01 82 df 0.40011 0.10000E-030.10000E-03 > 1.0000 0.35163 0.34838 > LogL= 122.124 S2= 0.14720E-01 82 df 0.44461 0.58089E-040.10000E-04 > 1.0000 0.35465 0.31151 > LogL= 122.722 S2= 0.18048E-01 82 df 0.42452 0.49216E-040.10000E-06 > 1.0000 0.34659 0.41749 > LogL= 122.719 S2= 0.20917E-01 82 df 0.36671 0.45774E-040.10000E-06 > 1.0000 0.40451 0.39840 > Final parameter values 0.56309 0.65929E-040.10000E-06 > 1.0000 0.37353 0.37798 > > Source Model terms Gamma Component Compnt/StndErr > Rep 5 5 0.563094 0.117781E-01 1.77 P > spl(ESTAB) 47 47 0.659291E-04 0.137902E-05 0.75 P > c(Variety).spl(ESTAB 94 94 0.100000E-06 0.209167E-08 1.86 B > Variance 108 82 1.00000 0.209167E-01 1.86 P > Residual AR=AutoR 54 0.373534 0.373534 3.20 U > Residual AR=AutoR 54 0.377980 0.377980 3.20 U ################################################ POINTS TO NOTE ############################################### 1. I wouldnt normally include Rep as a fixed effect in a spatial analysis! This would need some reasonable diagnostic to convince me that it was necessary. The gamma seems large so I am suspicious that there is something strange with the field layout. Were the replicates adjacent. Is the field layout correct. The data must be exhibiting large trend!! 2. the gamma for variety.spl() is zero. This could mean that ESTAB may need scaling or it is zero. Try rerunning the job with ESTAB /100. Gammas for spl() terms are scale dependant, just like regression coeff's so be careful. 3. There are too many knots in ESTAB (ie 47!). I presume this trial was designed as a density by variety trial, with target densities. I would try the target densities as input for the spline. This is an interesting, as yet unsolved problem. My pragmatic approach is to choose knot points for which there is coverage of the design space for the 'x' but only enough to be sensible. Marx and Eilers have done some work in this area but I haven't fully investigated the issue. I certainly dont think ASREML has it right yet. We (ie Robin and I) have some notes on how to fit models with more data than knots but it is not in ASREML yet. This data set is an exmaple of one where you should nominate knots and then use the results. The current strategy for these data would be to use the target density as a sensible 'x'. 4. Getting fitted splines!!!!!! Well this is tricky. I dont use pin files (sorry Arthur) I prefer to reconstruct the spline myself. The output in the .asr file is all you need, along with some code to generate the spline design matrix (Z_s) which we have, if you would like it. You can turn it into a GENSTAT proc if you like ie pick up the output from .asr and use GENSTAT to get the fitted spline. The .vrb file can be used to get prediction intervals. The output below gives the values of Z_su_s for spl() and variety.spl(). You simply add these to the linear part ie for variety 1 fitspl = 157.88 - .0355 - .001966*ESTAB +.000582*ESTAB + spl() + var1.spl() mu MERRIT ESTAB c(var).ESTAB spl() v1.spl() Ie there are bits from the usual linear part and then add the two spl() parts. 4. I get nervous about ROW and COL as you have it. I always do it as I have it in the attached file. The data is sorted into rows within cols and then I cant be worried about whether ASREML is fitting the model I want. Hope this helps cheers brian > WARN ING: Code B - fixed at a boundary (!GP) > C - Constrained by user (!CON) > S - Singular Information matrix > S means there is no information in the data for this parameter. > Very small components with Comp/SE ratios of zero sometimes indicate poor > scaling. Consider rescaling the design matrix in such cases > Fitted Spline 49 (X) for spl(ESTAB) > 19.600 21.800 24.400 27.050 28.200 29.800 32.800 37.400 > 41.200 42.900 44.000 45.200 46.600 48.400 50.350 52.400 > 54.200 56.000 57.700 59.000 60.600 63.350 64.733 69.900 > 71.400 72.800 76.000 79.500 80.800 82.400 85.000 87.267 > 89.400 91.600 93.400 95.600 98.600 100.000 103.800 105.800 > 111.000 115.267 116.900 118.800 121.000 122.700 128.400 130.800 > 137.600 > Fitted Spline 49 (Y) for spl(ESTAB) 1 > 0.028825 0.025985 0.022633 0.019228 0.017760 0.015734 0.011984 0.006362 > 0.001891 -0.000024 -0.001228 -0.002504 -0.003938 -0.005686 -0.007444 -0.009122 > -0.010445 -0.011620 -0.012588 -0.013231 -0.013901 -0.014735 -0.015003 -0.015187 > -0.015038 -0.014829 -0.014125 -0.013057 -0.012588 -0.011960 -0.010830 -0.009747 > -0.008657 -0.007466 -0.006443 -0.005138 -0.003267 -0.002361 0.000183 0.001561 > 0.005231 0.008293 0.009465 0.010819 0.012369 0.013552 0.017448 0.019085 > 0.023756 > ', , 0.03 > '',, ,' 0.02 > ' ,,,'' 0.01 > ',, ,,' 0.00 > ''',, ,,,'' 0.00 > ''',,,,,,,,,,,'''' -0.01 > -0.02 > -0.03 > Fitted Spline 49 (Y) for c(Variety).spl(ESTAB 1 > -0.74E-06 -0.58E-06 -0.40E-06 -0.23E-06 -0.16E-06 -0.58E-07 0.14E-06 0.50E-06 > 0.89E-06 0.11E-05 0.13E-05 0.15E-05 0.17E-05 0.20E-05 0.24E-05 0.29E-05 > 0.32E-05 0.35E-05 0.38E-05 0.39E-05 0.39E-05 0.37E-05 0.35E-05 0.20E-05 > 0.14E-05 0.73E-06 -0.97E-06 -0.30E-05 -0.37E-05 -0.45E-05 -0.57E-05 -0.65E-05 > -0.70E-05 -0.74E-05 -0.76E-05 -0.76E-05 -0.72E-05 -0.69E-05 -0.59E-05 -0.51E-05 > -0.26E-05 -0.24E-06 0.73E-06 0.19E-05 0.32E-05 0.42E-05 0.76E-05 0.91E-05 > 0.13E-04 > ''''''''''''''''''''''''''''''''''''''''''''''''' 0.00 > 0.00 > 0.00 > -0.01 > -0.01 > -0.01 > -0.01 > -0.01 > Fitted Spline 49 (Y) for c(Variety).spl(ESTAB 2 > -0.61E-05 -0.52E-05 -0.41E-05 -0.31E-05 -0.26E-05 -0.20E-05 -0.93E-06 0.40E-06 > 0.11E-05 0.13E-05 0.14E-05 0.16E-05 0.17E-05 0.17E-05 0.18E-05 0.17E-05 > 0.16E-05 0.15E-05 0.13E-05 0.12E-05 0.94E-06 0.53E-06 0.33E-06 -0.14E-06 > -0.18E-06 -0.17E-06 0.13E-07 0.44E-06 0.65E-06 0.93E-06 0.14E-05 0.19E-05 > 0.22E-05 0.26E-05 0.28E-05 0.29E-05 0.30E-05 0.30E-05 0.27E-05 0.24E-05 > 0.13E-05 0.12E-06 -0.38E-06 -0.10E-05 -0.18E-05 -0.24E-05 -0.46E-05 -0.55E-05 > -0.82E-05 > ''''''''''''''''''''''''''''''''''''''''''''''''' 0.00 > 0.00 > 0.00 > -0.01 > -0.01 > -0.01 > -0.01 > -0.01 > Solution Standard Error T-value T-prev > 12 c(Variety).ESTAB 2 3.46 3.46 [DF > F_inc F_all] > 1 0.582442E-03 0.385741E-03 1.51 > 2 -0.101474E-02 0.388952E-03 -2.61 -2.41 > 8 ESTAB 1 23.84 24.87 [DF > F_inc F_all] > 3 -0.196641E-02 0.394323E-03 -4.99 > 11 c(Variety) 2 0.37 2.08 0.5136E-01 [DF F_i > F_a SED] > MERRIT -0.305525E-01 0.302952E-01 -1.01 > 85SO46-37 0.607833E-01 0.298523E-01 2.04 1.73 > 10 mv_estimates 20 13.49 14.01 [DF > F_inc F_all] > 6 -0.882628 0.916597E-01 -9.63 > 7 -0.647879 0.941482E-01 -6.88 1.92 > 8 -1.07879 0.116429 -9.27 -2.83 > 9 -1.11726 0.122686 -9.11 -0.34 > 10 -1.10920 0.133636 -8.30 0.07 > 11 -1.12127 0.138845 -8.08 -0.10 > 12 -1.12293 0.143739 -7.81 -0.01 > 13 -1.12842 0.147031 -7.67 -0.05 > 14 -1.13130 0.149708 -7.56 -0.02 > 15 -1.13465 0.151712 -7.48 -0.03 > 16 -1.13715 0.153298 -7.42 -0.02 > 17 -1.13950 0.154532 -7.37 -0.02 > 18 -1.14145 0.155512 -7.34 -0.02 > 19 -1.14317 0.156290 -7.31 -0.01 > 20 -1.14464 0.156913 -7.29 -0.01 > 21 -1.14592 0.157416 -7.28 -0.01 > 22 -1.14703 0.157823 -7.27 -0.01 > 23 -1.14799 0.158154 -7.26 -0.01 > 24 -1.14882 0.158426 -7.25 -0.01 > 25 -1.14953 0.158650 -7.25 -0.01 > 9 mu 1 157.88 287.57 [DF > F_inc F_all] > 26 1.15412 0.680586E-01 16.96 > 3 Rep 5 effects fitted > 13 spl(ESTAB) 47 effects fitted > 14 c(Variety).spl(ESTAB 94 effects fitted > Finished: 12:10:59.39 LogL Converged > > _____________________________________________________________________ > N.W. Galwey, > Faculty of Agriculture, > University of Western Australia, > Nedlands, WA 6709, Australia. > > Tel.: +61 9 380 1959 (direct line) > +61 9 380 2554 (switchboard) > Fax: +61 9 380 1108 > > ............................................................................... Brian Cullis Tel: 02 6938 1855 NSW Agriculture Fax: 02 6938 1809 Wagga Agricultural Institute email: brian.cullis@agric.nsw.gov.au Pine Gully Rd WAGGA WAGGA NSW 2650

met cowl num 1 row 27 column 6 variety 9 !A seedrate 7 !I lincol 1 linrow 1 rcol 6 rrow 27 fixrow 4 x 1 site 11 yield 1 splrow 1 trt 62 !A cowlred.asd !skip 1 !dense 99 yield ~ mu mv c(site) x site.x, site|2/lincol site|5/lincol, site|6/lincol site|8/lincol, site|9/lincol site|10/lincol, site|3/linrow site|5/linrow, site|7/linrow site|8/linrow, site|2/c(fixrow) site|3/c(fixrow), site|5/c(fixrow) site|7/c(fixrow), site|8/c(fixrow) site|9/c(fixrow) site|11/c(fixrow), !r spl(x) .00349 , variety .002 va.x .00046 va.spl(x) .000757, si.va .018 si.va.x .00021, site.spl(x) .000450 -si.va.spl(x) .00002, -seedrate .001 -va.se .001 -si.se .001 -si.va.se .000113, site|3/rrow .001 site|6/rrow .0017, site|7/rrow .00046, site|9/rrow .0045, site|1/rcol .01 site|2/rcol .025 site|3/rcol .082, -site|7/rcol .0031 site|9/rcol .0042 site|10/rcol .00043, -rcol -rrow -rcol.rrow, si|5/rc.rr .0042, si|7/rc.rr .013, si|8/rc.rr .013, -si|10/rc.rr .0024 11 2 2 6 column AR .15 !S2=.034 16 row AR .21 6 column AR .16 !S2=.039 16 row AR .31 4 column AR .35 !S2=.067 16 row AR .17 6 column AR .04 !S2=.035 16 row AR .08 6 column AR .07 !S2=.018 16 row AR .58 6 column AR .22 !S2=.024 27 row AR .34 6 column AR .51 !S2=.041 27 row AR .92 6 column AR .33 !S2=.020 27 row AR .74 6 column AR -.02 !S2=.018 27 row AR .24 6 column AR .18 !S2=.0088 27 row AR .53 6 column AR .082 !S2=.054 27 row AR .21 variety 2 2 0 CORR .67 .0053 .000409 9 0 0 si.va 2 2 0 CORR .995 .018 .000178 99 0 0

**References**:**Further problem with prediction***From:*"N.W. Galwey" <ngalwey@cyllene.uwa.edu.au>

- Prev by Date:
**Re: Thanks and another question** - Next by Date:
**Re: Further problem with prediction** - Prev by thread:
**Further problem with prediction** - Next by thread:
**Re: Further problem with prediction** - Index(es):