Consider change to reference 346 #1032
Comments
The reference numbers can change with each build of the manuscript, so I'm noting here that this comment refers to: "Sensitive detection of rare disease-associated cell subsets via representation learning". We'd welcome a small pull request to clarify this. The authors do claim they have a convolutional neural network, so I'd have to reread their methods to see what they're doing.
Yes, I realized that the reference number might change after I posted my comment. To be specific, I looked at the code here, since the link in the paper is broken: https://github.com/eiriniar/CellCnn/blob/master/cellCnn/model.py. A 1x1 convolution with channels last is equivalent to a dense layer; I'll leave it up to you to decide whether or not to call that a convolution. It also seems I was mistaken about the sorting: each filter will be sorted independently.
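A minimal sketch of that sorting point (the numbers are made up, and this is not CellCnn's actual code): sorting a (cells, filters) response matrix along the cell axis orders each filter's column independently, rather than ordering whole rows by the first filter.

```python
# Illustrative only: sorting per-cell filter responses along the cell axis.
import numpy as np

responses = np.array([[3., 9.],
                      [1., 4.],
                      [2., 7.]])  # 3 cells x 2 filters
print(np.sort(responses, axis=0))
# [[1. 4.]
#  [2. 7.]
#  [3. 9.]]  <- each filter's column is sorted on its own
```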
It sounds like they have a special case of convolution similar to this part of the Transformer architecture in the Attention Is All You Need paper (the position-wise feed-forward layers, which that paper notes can be described as two convolutions with kernel size 1). Is that right? I want to differentiate between this being a special case of convolution that should be clarified in the text versus this not being a convolution in any sense before making an edit.
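For reference, a minimal sketch of that position-wise feed-forward step (Eq. 2 of the paper); the function name and shapes here are mine, but the formula is the paper's:

```python
# FFN(x) = max(0, x W1 + b1) W2 + b2, applied to each position
# independently -- equivalently, two convolutions with kernel size 1.
# In the CellCnn setting the "positions" would be cells rather than tokens.
import numpy as np

def position_wise_ffn(x, w1, b1, w2, b2):
    # x: (positions, d_model); the same weights hit every position.
    return np.maximum(0.0, x @ w1 + b1) @ w2 + b2
```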
A 1x1 convolution is a common step in convolutional networks, but it is simply a dimensionality-reduction technique and mathematically equivalent to a dense layer. While it's used in convolutional models, it alone doesn't make a model convolutional. It is used instead of a dense layer because a dense layer requires the input dimension to be known while a 1x1 conv layer does not. This post goes into more detail: https://stackoverflow.com/questions/39366271/for-what-reason-convolution-1x1-is-used-in-deep-neural-networks. The authors of this paper likely used the 1x1 conv so that they could have a variable number of cells as input; a dense layer would have required knowing the exact number of cells. So they could train a model where each sample had 100 cells, then apply it to problems with 1000 cells, and the model would still run. However, regardless of how many cells their model is run on, it is only ever featurizing a single cell at a time, which isn't a convolution. While that is a clever way to perform multiple instance learning, the most natural way is with ragged tensors, as we did (https://github.com/OmnesRes/ATGC/tree/method_paper). As an aside, you may want to look at our MIL paper, as I view it as a large step in bringing deep learning to genomics data: https://www.biorxiv.org/content/10.1101/2020.08.05.237206v4
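A minimal sketch of those two points (the marker and filter counts are made up): with kernel_size=1 the cell axis can be left unspecified, and the layer's output matches a plain per-cell matrix multiply, i.e. a dense layer.

```python
# Sketch: a kernel_size=1 Conv1D accepts any number of cells and computes
# exactly what a dense layer applied to each cell would compute.
import numpy as np
import tensorflow as tf

n_markers, n_filters = 35, 8  # hypothetical sizes
inp = tf.keras.Input(shape=(None, n_markers))  # (cells, markers); cell count left open
out = tf.keras.layers.Conv1D(n_filters, kernel_size=1)(inp)
model = tf.keras.Model(inp, out)

x100 = np.random.rand(1, 100, n_markers).astype("float32")    # 100 cells
x1000 = np.random.rand(1, 1000, n_markers).astype("float32")  # 1000 cells
print(model(x100).shape, model(x1000).shape)  # (1, 100, 8) (1, 1000, 8)

# The same weights as a plain matrix multiply over each cell (a dense layer):
w, b = model.layers[1].get_weights()  # w: (1, markers, filters)
print(np.allclose(model(x100).numpy(), x100 @ w[0] + b, atol=1e-5))  # True
```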
I suspect this is the version of the code closest to / just proximal to the publication; the build_model function in that code base clearly defines the model: https://github.com/eiriniar/CellCnn/blob/e0464f64ce60fc616c2bb7724134b306910bb43d/cellCnn/model.py. The key point is that the conv1d step uses kernel_size=1. The authors correctly understood that the cell axis is permutation invariant (i.e., the order of the cells means nothing), which is obvious when you consider the source data along with the random sub-sampling strategy their implementation uses to fix the number of cells along that axis.

However, as my colleague above is suggesting, the use of a 1x1 1D conv in this manner is really not "convolution", in my humble opinion. A simple dense layer accomplishes the identical task and is much more straightforward to understand. In fact, that is much more in line with how the model is presented in the manuscript (Fig 1b), particularly where "filter 1" is shown as a vector of weights applied to cell feature vectors, yielding a single value for that cell-filter combination.

I would like to point out that we believe the results are all likely valid; this whole discussion is just about semantics, since the two operations produce the same result. The bottom line is that, to us, the notion of convolution does not really apply because the cells are permutation invariant and the cell feature axes are all "independent" variables. For convenience, I have refactored that build below to just the core steps (removing regularization, filtering, etc.):
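(A minimal Keras sketch of those core steps; the layer sizes are illustrative, and the actual code's pooling - a mean over each filter's top-responding cells, if I recall correctly - is simplified here to a plain mean over all cells.)

```python
# Hedged sketch of the core steps of CellCnn's build_model; sizes are
# hypothetical, and regularization / filter selection are omitted.
from tensorflow import keras
from tensorflow.keras import layers

ncell, nmark, nfilter = 1000, 35, 8  # cells per sub-sampled input, markers, filters

data_input = keras.Input(shape=(ncell, nmark))
# kernel_size=1: each filter is a single weight vector applied to every
# cell independently, so cell order along this axis is irrelevant.
conv = layers.Conv1D(nfilter, kernel_size=1, activation="relu")(data_input)
# Pool each filter's per-cell responses across the cell axis.
pooled = layers.GlobalAveragePooling1D()(conv)
output = layers.Dense(1, activation="sigmoid")(pooled)
model = keras.Model(data_input, output)
```

The convolution never looks at more than one cell at a time, and the pooling then discards cell order entirely.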
Yes, I agree. I seem to recall others using the term "convolution" to describe this same operation on gene expression vectors around this time (maybe a talk I saw on https://github.com/KnowEnG-Research/knocknet), but the main point is that currently saying "CNN" alone could be confusing or misleading. The current text is in https://github.com/greenelab/deep-review/blob/master/content/04.study.md
If either of you would like to make a small pull request - for example, replacing "CNN" with "model" or "NN" and adding a new sentence at the end explaining that the model uses convolutional layers but in an atypical way - I'll review it. You could add yourself to https://github.com/greenelab/deep-review/blob/master/content/08.methods.md#acknowledgements as well.
To me, when I hear CNN, that implies there is some sort of order to the data - for example, pixels in an image or nucleotides in DNA. Looking at the name of their model, CellCnn, my first inclination would be that they somehow imparted order to the cells and convolved over multiple cells at a time. However, they essentially have tabular data and applied MIL to it. MIL is currently severely underutilized in biology, so I think the authors deserve credit for being among the first to apply it to this field (in fact, your manuscript appears to have no mention of MIL at the moment). I'll think about how to get that point across, but mentioning MIL may require a new term in your glossary? A sketch of what I mean is below.
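(A hypothetical sketch of MIL on this kind of data using ragged tensors - none of this is taken from CellCnn or ATGC: each sample is a bag holding a variable number of cells, each cell is featurized independently, and the features are pooled into one bag-level prediction.)

```python
# Hypothetical MIL sketch with ragged tensors: bags (samples) hold
# different numbers of instances (cells).
import numpy as np
import tensorflow as tf

nmark, nfilter = 5, 4  # made-up marker/filter counts
rng = np.random.default_rng(0)
bags = tf.ragged.constant(
    [rng.standard_normal((3, nmark)).tolist(),   # sample 1: 3 cells
     rng.standard_normal((2, nmark)).tolist()],  # sample 2: 2 cells
    ragged_rank=1)

featurize = tf.keras.layers.Dense(nfilter, activation="relu")
per_cell = tf.ragged.map_flat_values(featurize, bags)  # featurize every cell
per_bag = tf.reduce_mean(per_cell, axis=1)             # pool cells within each bag
logits = tf.keras.layers.Dense(1)(per_bag)             # one prediction per bag
print(logits.shape)  # (2, 1)
```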
I suggest separating the two possible revisions and pull requests. Clarifying the description of CellCnn would require only a new sentence or two, which we could review and merge quickly. Writing about multiple instance learning is also a good idea, and you are correct that it is a surprising omission in the current manuscript. There have been applications in biomedical imaging too that could be discussed. However, that would be a bigger change and would require more writing and editing.
If you add a section on MIL, you may want to include our recent publication: https://www.nature.com/articles/s41551-023-01120-3
While the authors claim to be doing a convolution, even going as far as naming their tool CellCnn, they use a kernel size of 1, which is essentially a dense layer. Further, you say they used the "most discriminative filters", but I believe the tensor sort they do will result in sorting by the first filter, then the second, etc. So the sorting is almost entirely driven by the first filter.