Grouping and comparing
Dear forum members,
I'm kind of a noob when it comes to statistics.
Currently I'm working on a research project in which I use SPSS to build a large dataset of historical varnish recipes. Each recipe consists of a list of products, so-called ingredients (var. Ingred), that need to be mixed to create a varnish.
Below I've made a simplified version of my current dataset in order to support my question:
This example shows only 3 of the 600+ recipes that I've collected. The variable Recipe_nr gives a number to each recipe. As I said above, each recipe consists of a number of components, which I have listed per recipe on separate rows: recipe 2 has 9 components (rows 2-10), recipe 3 has 4 (rows 12-15) and recipe 4 has 6 (rows 17-22).
My question is whether it is possible to group each recipe using the 'Recipe_nr' variable, so that SPSS identifies all 9 components as belonging to recipe 2, all 4 as belonging to recipe 3, and so on. I would like to do this so I can compare all recipes with one another for similarities based on their components. It is important for me to see the relationships between the component lists of all recipes, as it is known that these were copied over time. So I expect duplicate cases, or recipes that differ from each other in only 1 or 2 ingredients. Could hierarchical clustering also help me identify such relations?
Or maybe grouping isn't the proper way to do this sort of analysis, so any help and suggestions are more than welcome.
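
To make concrete what I mean by grouping and comparing, here is a minimal sketch in Python rather than SPSS syntax. It assumes the data are in long format with one (Recipe_nr, Ingred) pair per row. The ingredient names and the function names group_by_recipe and compare_recipes are made up for illustration, and using Jaccard similarity as the measure is just one possible choice, not something I know to be the right one.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical long-format rows (Recipe_nr, Ingred).
# The ingredient names are invented for illustration only.
rows = [
    (2, "linseed oil"), (2, "mastic"), (2, "turpentine"),
    (3, "linseed oil"), (3, "mastic"), (3, "amber"),
    (4, "linseed oil"), (4, "mastic"), (4, "turpentine"),
]


def group_by_recipe(rows):
    """Collect the ingredients of each recipe into one set per Recipe_nr."""
    recipes = defaultdict(set)
    for recipe_nr, ingred in rows:
        # Normalise spelling a little so "Mastic " and "mastic" match.
        recipes[recipe_nr].add(ingred.strip().lower())
    return dict(recipes)


def compare_recipes(recipes):
    """Return (recipe_a, recipe_b, jaccard, n_differing) for every pair."""
    results = []
    for a, b in combinations(sorted(recipes), 2):
        shared = recipes[a] & recipes[b]       # ingredients in both recipes
        union = recipes[a] | recipes[b]        # ingredients in either recipe
        differing = recipes[a] ^ recipes[b]    # ingredients in exactly one
        jaccard = len(shared) / len(union) if union else 1.0
        results.append((a, b, jaccard, len(differing)))
    return results


recipes = group_by_recipe(rows)
for a, b, jaccard, n_diff in compare_recipes(recipes):
    print(f"recipe {a} vs {b}: Jaccard={jaccard:.2f}, differing ingredients={n_diff}")
```

In this sketch a pair with a Jaccard similarity of 1.0 would be an exact duplicate, and a pair with only 1 or 2 differing ingredients would be a candidate copy. A matrix of 1 minus the Jaccard similarity could also serve as the distance input for hierarchical clustering.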