I want to create a deep learning model. I know how to do so with simple samples, where each row of the dataset represents one sample, but my samples are 2D tables of numbers. The features are the same in all samples, but the number of rows varies from sample to sample, as in the tables below.
Sample1:
| Feature 1 | Feature 2 | ... | Feature n | Result |
|---|---|---|---|---|
| row 1F1 | row 1F2 | ... | row 1Fn | S1 |
| row 2F1 | row 2F2 | ... | row 2Fn | S1 |
| ... | ... | ... | ... | |
| row 1000F1 | row 1000F2 | ... | row 1000Fn | S1 |
Sample2:
| Feature 1 | Feature 2 | ... | Feature n | Result |
|---|---|---|---|---|
| row 1F1 | row 1F2 | ... | row 1Fn | S2 |
| row 2F1 | row 2F2 | ... | row 2Fn | S2 |
| ... | ... | ... | ... | |
| row 1200F1 | row 1200F2 | ... | row 1200Fn | S2 |
The features are the same in every sample; for example, feature 1 represents "Name" in all samples, feature 2 represents "City", and so on. Only the number of rows per sample varies.
I cannot discard rows to make the samples equal in length, since every row affects the result. I also cannot turn rows into new features to flatten each sample into a 1D array, because there are far too many rows for that.
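To make the structure concrete, here is a small sketch of how my dataset looks in Python (the feature values and row counts are made up; `n_features` and the labels are placeholders):

```python
import numpy as np

# Each sample is a 2D array of shape (number_of_rows, n_features).
# The number of rows differs per sample; the feature columns do not.
n_features = 4
sample1 = np.random.rand(1000, n_features)  # sample 1 has 1000 rows
sample2 = np.random.rand(1200, n_features)  # sample 2 has 1200 rows

X = [sample1, sample2]    # a plain list, since the shapes differ
y = np.array([0.7, 0.3])  # one result value per sample (S1, S2)

print([s.shape for s in X])  # [(1000, 4), (1200, 4)]
```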
- Do I need to somehow normalize the dimensions of the samples? (If so, how can I do that?)
- Or is there a deep learning model that can take inputs with different dimensions, like my samples?
(I am using Python. I have read that one option is to encode and decode the samples with an LSTM seq2seq model, but I don't know whether that is the right way to handle my dataset.)
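For context, this is what I understand "normalizing the dimension" by zero-padding would look like in NumPy (the function name `pad_samples` is just mine; I'm not sure padding is appropriate here, since every real row carries information):

```python
import numpy as np

def pad_samples(samples, pad_value=0.0):
    """Zero-pad a list of (rows, n_features) arrays to a common row count."""
    max_len = max(s.shape[0] for s in samples)
    n_features = samples[0].shape[1]
    out = np.full((len(samples), max_len, n_features), pad_value)
    for i, s in enumerate(samples):
        out[i, : s.shape[0], :] = s  # copy the real rows; the rest stay padded
    return out

# Two toy samples with 3 and 5 rows of 2 features each
samples = [np.ones((3, 2)), np.ones((5, 2))]
batch = pad_samples(samples)
print(batch.shape)  # (2, 5, 2): batch, max rows, features
```

This gives a rectangular `(batch, max_rows, n_features)` array, which is the shape recurrent layers typically expect, but the model would also need some way to ignore the padded rows.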
Thanks in advance.