Weird dateasnumeric format
Re: Weird dateasnumeric format
That is the right paper, and yes the data is from 2005.
I'm not sure why I got such a different value from LibreOffice — is that really the solution?

Good point about the precision.
Re: Weird dateasnumeric format
Yeah, it's got to start at 0 (no juveniles at the beginning of the season). You may or may not reach a mean JP of 1, depending on whether the last juveniles depart alone or with adults, but it seems sensible to have 1 as another asymptote.
I'm using logistic regression to do it properly, but I'd also like to compare these curves if possible.
Re: Weird dateasnumeric format
Yes, I just double-checked: if I put 01/06/2005 as a date in LibreOffice (version 6.4.7.2) and then convert it to numeric I get 38504. And 42280 comes out as 03/10/15. Checking the options, the zero date is 30/12/1899. The rest is presumably the decimal precision error shpalman identified.
Still, if I just need to work out an integer offset for Julian day of year I think this is doable. I'll refit the original quadratic in R.
Thank you, hivemind!
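For anyone wanting to reproduce the date check above outside a spreadsheet, here is a minimal Python sketch. It assumes the zero date of 30/12/1899 reported in the post; the function names are made up for illustration.

```python
from datetime import date, timedelta

# LibreOffice's zero date, as reported in the post above.
EPOCH = date(1899, 12, 30)

def date_to_serial(d):
    """Number of whole days between the zero date and d."""
    return (d - EPOCH).days

def serial_to_date(n):
    """Calendar date for an integer serial number."""
    return EPOCH + timedelta(days=n)

def day_of_year(n):
    """Day of year (1 = 1 January) for an integer serial number."""
    return serial_to_date(n).timetuple().tm_yday

print(date_to_serial(date(2005, 6, 1)))    # 38504
print(serial_to_date(42280))               # 2015-10-03
print(day_of_year(38504))                  # 152
# Integer offset between serial number and day of year, for 2005 only.
print(date_to_serial(date(2004, 12, 31)))  # 38352
```

Subtracting that last offset from a 2005 serial number gives the day of year directly.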
Re: Weird dateasnumeric format
That's almost certainly it. If you rerun the linear equations with the x coefficient changed by +/- 1 unit in the least significant place, you get:

T    -unit   exact   +unit
20   38144   38447   38754
0    38175   38577   38987
0    38492   38621   38750
3    38459   38623   38788
and 38548 is in all those intervals.
Re: Weird dateasnumeric format
So here's my scrape of the data from the image, LibreOffice's "I wanted a curve so I made one with math" fit as the solid line with the equation displayed on the graph, and the crosses which follow that line are plotted using the equation given above to check that you really do need all those stupid digits.

Using LibreOffice's zero day convention (30/12/1899) and assuming the graph is 2005 I get something similar to the graph with

0.0004395799826*(date)^2−33.885347*date+653019.2

but if you don't use all those decimal places it is nowhere near.
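A quick way to see why the digits matter, as a rough Python sketch. The coefficients are the ones quoted above; the test date (12 August 2005 under the 30/12/1899 zero date) is just an arbitrary pick of mine.

```python
# Coefficients as quoted above; date is a LibreOffice serial day number.
A, B, C = 0.0004395799826, 33.885347, 653019.2

def quadratic(date, a=A, b=B, c=C):
    """Evaluate a*date^2 - b*date + c."""
    return a * date**2 - b * date + c

d = 38576  # assumed test date: 12 August 2005 as a serial number

full = quadratic(d)                    # all quoted digits
rounded = quadratic(d, a=round(A, 5))  # a rounded to 0.00044

print(full)
print(rounded)
```

Each of the three terms is around 650,000 in size and they almost cancel, so a change of about 4e-7 in the leading coefficient gets multiplied by date^2 (roughly 1.5e9) and moves the answer by hundreds.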
Re: Weird dateasnumeric format
That's awesome — thank you so much!
Re: Weird dateasnumeric format
It still looks like a straight line starting at some point on the x axis fits better to me.
And a Gaussian* convolved with a line or a step function starting at some point in time would be a much better motivated model (i.e. a "cloud" of birds arriving at some point in time, with a few ahead of the cloud and some stragglers, or the equivalent for what's being measured).
A quadratic needs some kind of explanation for why the distribution should be quadratic before you try to fit it.
* Because all soft-edged things look like a Gaussian convolved with something if you squint a bit.
Re: Weird dateasnumeric format
The data is really representing the ratio between two (very) roughly normal distributions centred on different dates: adults arrive in Iceland, successful fledging happens in a staggered way, then adults leave before juveniles.
I don't think a quadratic is the best way to model that, but you would expect a curve that trundles along near zero for a while. I think it's fine for descriptive, if not predictive, purposes. (The godwit is one of few species in that paper where all the birds are breeding in Iceland, rather than being joined by migrants from more northerly populations in Greenland and Canada.)

At some point I can play around with simulations, but I've got to finish writing, practise and record a conference presentation by the end of the day, so y'know…
Re: Weird dateasnumeric format
This is so f.cking important.
Re: Weird dateasnumeric format
A step function smoothed with a Gaussian would give you a function that goes from zero to one with some width, and only requires two parameters to be fit rather than three for a quadratic.
That's not a model, it's just the simplest description of the measure that obeys the same constraints as the measure (can't be less than zero, can't go above one).
Re: Weird dateasnumeric format
That's the error function isn't it? Or rather, (1+erf[(x−x0)/w])/2 if you want to go from 0 to 1 around x0 and your width parameter is w.

https://docs.scipy.org/doc/scipy/refere ... l.erf.html

I'm sure there's an implementation in whatever mathematical software you're using.
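In plain Python, for instance, the standard library's math.erf is enough. This is a minimal sketch of the formula above; the function name and the example centre and width are arbitrary choices of mine.

```python
import math

def erf_step(x, x0, w):
    """Smooth step from 0 to 1, centred on x0 with width parameter w."""
    return 0.5 * (1.0 + math.erf((x - x0) / w))

# Illustrative values only: centre at 0, width 10.
for x in (-30, -10, 0, 10, 30):
    print(x, round(erf_step(x, 0.0, 10.0), 4))
```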
Re: Weird dateasnumeric format
Yeah, that's the one.
Re: Weird dateasnumeric format
I hesitate to use the phrase "Gompertz function" but it *is* population dynamics.
Can the value exceed 1?
Re: Weird dateasnumeric format
No. It's the proportion of birds seen which are juveniles, so it must be a rational number between 0 and 1 inclusive, further limited by the maximum possible number of birds you can count (whether due to counting ability or the fact that there's only so many birds in the world).
Re: Weird dateasnumeric format
I thought it had to be something like that, so any sensible function that tells you anything useful would also have to fit in that range, which is why I mentioned the Gompertz function.
https://en.m.wikipedia.org/wiki/Gompertz_function
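For reference, a minimal sketch of the Gompertz curve in its usual form a*exp(-b*exp(-c*t)). The default parameter values are arbitrary and not fitted to anything in this thread.

```python
import math

def gompertz(t, a=1.0, b=1.0, c=0.1):
    """Gompertz curve a*exp(-b*exp(-c*t)).

    With b, c > 0 it rises from 0 towards the upper asymptote a,
    approaching a more slowly than it leaves 0.
    """
    return a * math.exp(-b * math.exp(-c * t))

for t in (-40, -20, 0, 20, 40, 80):
    print(t, gompertz(t))
```

With a = 1 it stays between 0 and 1 like the erf and logistic curves, but it is not symmetric about its midpoint.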
Re: Weird dateasnumeric format
Fitting the error function centres it on the 12th of August with a width* of 14.7 days.
* Or 10.4 days if you multiply by sqrt(2) in the definition, i.e. maybe it's more "correct" to do 0.5*(1+erf([x−x0]/[sqrt(2)*w]))
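The sqrt(2) version is the cumulative normal distribution with standard deviation w, which can be checked against Python's statistics.NormalDist. A small sketch; only the 14.7 days comes from the fit above, the rest is illustration.

```python
import math
from statistics import NormalDist

def erf_step_sigma(x, x0, sigma):
    """0.5*(1+erf((x-x0)/(sqrt(2)*sigma))): the sqrt(2) form of the step."""
    return 0.5 * (1.0 + math.erf((x - x0) / (math.sqrt(2.0) * sigma)))

w = 14.7                    # width from the fit quoted above, in days
sigma = w / math.sqrt(2.0)  # equivalent width in the sqrt(2) convention
print(round(sigma, 1))      # 10.4

# Same curve as the normal CDF with that standard deviation.
x = 5.0  # arbitrary offset in days from the centre
print(erf_step_sigma(x, 0.0, sigma), NormalDist(0.0, sigma).cdf(x))
```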
Re: Weird dateasnumeric format
That's using the simple logistic curve, 1/(1+exp(−(d−d0)/w)), and d0 is still the 12th of August but its width parameter is 6.2 days. Of course you can't quantitatively compare the width parameters between the erf and logistic models.
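To put a number on how close the two fitted curves are, here is a rough Python sketch using the two quoted widths and a shared centre. The range of plus or minus 60 days is my own choice.

```python
import math

def erf_step(d, d0, w):
    return 0.5 * (1.0 + math.erf((d - d0) / w))

def logistic(d, d0, w):
    return 1.0 / (1.0 + math.exp(-(d - d0) / w))

# Widths from the two fits quoted above; d is in days relative to d0.
W_ERF, W_LOGISTIC = 14.7, 6.2

# Largest vertical gap between the two curves over +/- 60 days.
max_gap = max(abs(erf_step(d, 0, W_ERF) - logistic(d, 0, W_LOGISTIC))
              for d in range(-60, 61))
print(max_gap)
```

The gap stays small across the whole range, even though 14.7 and 6.2 look very different as numbers.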
Re: Weird dateasnumeric format
Yeah there's not much difference; I wouldn't try to make the model more complicated than this.
Re: Weird dateasnumeric format
Since there was no queue at the petrol station and no shortages at the supermarket I had time also to check the quadratic fit:
So the fitting of a(d−d0)^2+b(d−d0)+c shifts d0 to 7th of August, and gives a=0.0004197306249464087, b=0.025756320921000418, c=0.33376244249276016, so the coefficient for the second order term is still roughly what it is in the paper.
This shifted version is a lot less sensitive to rounding: I also plotted it with the coefficients rounded off and you can see it's more or less the same. It's just not at all physically reasonable to use this model.
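The rounding check can be repeated numerically. This is a sketch only: the two-significant-figure rounding and the range of 40 days either side of d0 are assumptions of mine, not necessarily what was plotted.

```python
# Coefficients of a*(d - d0)**2 + b*(d - d0) + c as quoted above.
A, B, C = 0.0004197306249464087, 0.025756320921000418, 0.33376244249276016

def shifted_quadratic(x, a=A, b=B, c=C):
    """x is d - d0, i.e. days relative to the fitted d0."""
    return a * x**2 + b * x + c

# Every coefficient rounded to two significant figures (by hand).
A2, B2, C2 = 0.00042, 0.026, 0.33

# Largest change caused by the rounding over the assumed range.
worst = max(abs(shifted_quadratic(x) - shifted_quadratic(x, A2, B2, C2))
            for x in range(-40, 41))
print(worst)
```

Because x is now tens of days rather than tens of thousands, none of the terms is large, so there is no near-cancellation to amplify the rounding.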
Re: Weird dateasnumeric format
If we're getting around to discussing how the data should be graphed, I'd very much agree with the sentiment already expressed that it should depend on the underlying processes rather than just looking for stuff which fits (though, of course, that strategy does occasionally reveal some underlying mechanism that was previously unknown). So here's a trivial example. Suppose we have observations whereby, starting on day 1, one adult arrives per day. Starting a day later, one juvenile arrives per day. When 16 adults have arrived, they stop arriving and start leaving at one per day. The same happens with the juveniles. This can be shown in this simple graph:
If we then decide to graph JP instead (which is juvenile/(adult + juvenile)), we get:
which looks nice and complicated, but tells us a lot less than the simple graph. It may be that the proportion of juveniles is inherently significant. For example, maybe parents feed their offspring, so a low proportion of juveniles implies food shortage, while an unusually high proportion implies a good year for breeding (or heavier predation on adults).
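The toy example is easy to reproduce. A sketch in Python, assuming that the count peaks at 16 on the sixteenth day of arrivals and drops by one on each following day; the function names are made up.

```python
def count(day, start, n=16):
    """Birds present on a given day: one arrives per day from `start`
    until n are present, then one leaves per day until none remain."""
    t = day - start + 1  # days since this group began arriving
    if t <= 0 or t >= 2 * n:
        return 0
    return t if t <= n else 2 * n - t

def juvenile_proportion(day):
    """JP = juvenile / (adult + juvenile); None when no birds are present."""
    adults = count(day, start=1)
    juveniles = count(day, start=2)
    total = adults + juveniles
    return juveniles / total if total else None

for day in (1, 2, 16, 17, 32):
    print(day, count(day, 1), count(day, 2), juvenile_proportion(day))
```

JP starts at 0 on day 1, crosses one half between days 16 and 17, and reaches 1 on day 32 when the only bird left is a juvenile.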
Re: Weird dateasnumeric format
Here's what you get with two normal distributions with peak height of 1 and the same width. The error function and logistic curves give roughly the right behaviour but the width parameters are related to both the width and the spacing of the two Gaussians. And then it depends if the peak height should be fixed, normalized so that the area is fixed, or variable, or what.
molto tricky
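For the specific case described above (same width, both peak heights 1), the algebra works out so that juvenile/(adult + juvenile) is exactly a logistic curve centred midway between the peaks, with width parameter sigma^2 divided by the peak spacing. Here is a Python sketch that checks this numerically. The means and width are made-up values, and the result does not carry over to unequal widths or heights.

```python
import math

def gaussian(x, mu, sigma):
    """Gaussian with peak height 1."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma**2))

def juvenile_proportion(x, mu_adult, mu_juv, sigma):
    a = gaussian(x, mu_adult, sigma)
    j = gaussian(x, mu_juv, sigma)
    return j / (a + j)

def logistic(x, x0, w):
    return 1.0 / (1.0 + math.exp(-(x - x0) / w))

# Illustrative values only: adults peak on day 0, juveniles on day 20, width 15.
mu_a, mu_j, sigma = 0.0, 20.0, 15.0
x0 = 0.5 * (mu_a + mu_j)      # midpoint of the two peaks
w = sigma**2 / (mu_j - mu_a)  # logistic width implied by the algebra

for x in (-20, 0, 10, 20, 40):
    print(x, juvenile_proportion(x, mu_a, mu_j, sigma), logistic(x, x0, w))
```

So in this special case the fitted logistic width mixes the Gaussian width and the spacing, and one fit cannot separate the two.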