Empirical protein models like WAG+F with estimated frequencies are not supported #132
Comments
The ability to estimate frequencies seems like a good idea, but I don’t have a strong opinion about the syntax. I’m guessing others might, though!
Cheers,
Jeremy
… On Jul 31, 2018, at 2:14 PM, Benjamin Redelings ***@***.***> wrote:
Right now fnWAG() uses the fixed frequencies estimated in the WAG paper, and produces a fixed Q matrix with no parameters.
However, what people usually do is estimate the frequencies while fixing the symmetric exchangeabilities. This would be easy to code; the only question is what kind of syntax we would want and what to name things.
Basically, we want fnWAG() to keep its current behavior, while fnWAG(pi) uses the frequencies in pi. Then we could place a Dirichlet distribution (or something) on pi.
Also, technically, this is a GTR model, with exchangeabilities supplied by WAG and frequencies pi being estimated. If the GTR model could be changed to take a symmetric matrix, then we could make fnWAG() just return the symmetric matrix, and WAG+F would be something like fnGTR(fnWAG(),pi).
A third approach (which seems to work so far) is to define fnWAG(pi) to always take a frequency vector. We then add a fnWAG_freq() to yield the fixed frequencies from the WAG paper. Users would then write fnWAG(pi) to estimate frequencies pi, and would write fnWAG(fnWAG_freq()) to use the fixed frequencies.
Since estimating frequencies is more common than using the fixed frequencies, I would recommend something like approach 3.
Thoughts?
P.S. Here is a case where someone wants to estimate the amino-acid frequencies, although not with WAG: https://groups.google.com/forum/#!topic/revbayes-users/cmhwuYklecg
(I emailed this to Ben, but I guess GitHub didn't add it to the thread) Personally, I like Option 1 the best, since it wouldn't require new users to learn "special" functions to design their models. If we want to allow all empirical rate matrices to accept pi/er parameters, that might require some deeper redesign/reorganization of the empirical rate matrix family. So I'd vote to hold off on that for now. A variant on Option 2 would be to add a helper function that supplies various empirical rate matrix values, e.g.
What do you think?
Hi Michael, I didn't see your e-mail, just the GitHub post. Anyway, yes, it does seem like Option 1 is nicest and easiest to guess or learn. Does RevBayes allow different functions to have the same name but different numbers of arguments? Alternatively, does RevBayes allow functions to have default values for parameters (Option 3)? If either is true, then I think I see how to implement this. Your variant on Option 2 is interesting. I like the option to use the bf_WAG but put a prior on it.
Now that #130 has been fixed, fnWAG() uses the fixed frequencies estimated in the WAG paper, and produces a fixed Q matrix with no parameters.
However, what people usually do is estimate the frequencies while fixing the symmetric exchangeabilities. This would be easy to code; the only question is what kind of syntax we would want and what to name things.
Basically, we want fnWAG() to keep its current behavior, while fnWAG(pi) uses the frequencies in pi. Then we could place a Dirichlet distribution (or something) on pi.
Also, technically, this is a GTR model, with exchangeabilities supplied by WAG and frequencies pi being estimated. If the GTR model could be changed to take a symmetric matrix, then we could make fnWAG() just return the symmetric matrix, and WAG+F would be something like fnGTR(fnWAG(),pi).
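To make the GTR framing concrete, here is a minimal Python sketch of how a rate matrix can be assembled from a symmetric exchangeability matrix and a frequency vector. The helper name `gtr_rate_matrix` and the 3-state toy values are assumptions made for illustration; this is not RevBayes code, and the numbers are not the 20-state WAG values. Rescaling to one expected substitution per unit time is a common convention that is assumed here.

```python
def gtr_rate_matrix(exch, pi):
    """Build Q with Q[i][j] = exch[i][j] * pi[j] for i != j.

    exch: symmetric matrix of exchangeabilities (diagonal ignored).
    pi:   stationary frequencies, summing to 1.
    Hypothetical helper for illustration; not part of RevBayes.
    """
    n = len(pi)
    if abs(sum(pi) - 1.0) > 1e-9:
        raise ValueError("frequencies must sum to 1")
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                q[i][j] = exch[i][j] * pi[j]
        # The diagonal is still 0 here, so this makes the row sum to zero.
        q[i][i] = -sum(q[i])
    # Rescale so the expected substitution rate at stationarity is 1.
    rate = -sum(pi[i] * q[i][i] for i in range(n))
    return [[x / rate for x in row] for row in q]


# Toy 3-state example (made-up numbers, not WAG values).
toy_exch = [[0.0, 1.0, 2.0],
            [1.0, 0.0, 0.5],
            [2.0, 0.5, 0.0]]
toy_pi = [0.5, 0.3, 0.2]
Q = gtr_rate_matrix(toy_exch, toy_pi)
```

Because the exchangeabilities are symmetric, the resulting matrix is time-reversible with respect to pi, which is why swapping in estimated frequencies leaves the rest of the model unchanged.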
A third approach (which seems to work so far) is to define fnWAG(pi) to always take a frequency vector. We then add a fnWAG_freq() to yield the fixed frequencies from the WAG paper. Users would then write fnWAG(pi) to estimate frequencies pi, and would write fnWAG(fnWAG_freq()) to use the fixed frequencies.
Since estimating frequencies is more common than using the fixed frequencies, I would recommend something like approach 3. If RevBayes functions support default values for parameters, we could make fnWAG_freq() the default value of pi for fnWAG(pi), which would be pretty nice.
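The calling pattern of approach 3 with a default value can be sketched in Python as follows. This only illustrates the interface shape, not how RevBayes implements or would implement it: `toy_wag`, `toy_wag_freq`, and `sample_dirichlet` are hypothetical stand-ins, the 3-state numbers are made up rather than taken from WAG, and the matrix is left unnormalized for brevity.

```python
import random

# Made-up 3-state placeholders; the real 20-state WAG values are not reproduced here.
TOY_EXCH = [[0.0, 1.0, 2.0],
            [1.0, 0.0, 0.5],
            [2.0, 0.5, 0.0]]


def toy_wag_freq():
    """Stand-in for the proposed fnWAG_freq(): fixed empirical frequencies."""
    return [0.4, 0.35, 0.25]


def toy_wag(pi=None):
    """Stand-in for fnWAG(pi); pi defaults to the fixed frequencies."""
    if pi is None:
        pi = toy_wag_freq()
    n = len(pi)
    q = [[TOY_EXCH[i][j] * pi[j] if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    for i in range(n):
        q[i][i] = -sum(q[i])  # rows of a rate matrix sum to zero
    return q


def sample_dirichlet(alpha, rng):
    """Draw from Dirichlet(alpha) by normalizing independent gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


rng = random.Random(1)
q_fixed = toy_wag()                     # like fnWAG(): fixed frequencies
q_explicit = toy_wag(toy_wag_freq())    # like fnWAG(fnWAG_freq())
pi = sample_dirichlet([1.0, 1.0, 1.0], rng)
q_est = toy_wag(pi)                     # like fnWAG(pi), pi drawn from a prior
```

With the default in place, the zero-argument call and the explicit `toy_wag(toy_wag_freq())` call give the same matrix, so existing scripts that call the function without arguments would keep their behavior.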
Thoughts?
P.S. Here is a case where someone wants to estimate the amino-acid frequencies, although not with the WAG - https://groups.google.com/forum/#!topic/revbayes-users/cmhwuYklecg