Hi,
"Developing a data relationship" is glib for some very sordid tasks including, but not limited to (to quote the full legal equivalent), delivering a list of customers at the drop of an SKU number. We can't deal with the 'not limited to' part here but, if you like, we can concentrate on getting a list of customers associated with an SKU.
Your List 2 has the customer name but the SKU that is associated with the name may or may not be the one you are looking for. The customer might buy an assembly that contains your search criterion, therefore qualifying for inclusion in the list, or your search might be for the very SKU he buys. In either case, you draw your SKU number from List 1. This direct or indirect search would guarantee that any formula you might design will be lengthy and complicated. In fact, it is this complication that speaks against it.
It is probably possible to write an Array Formula that can do the job, or to write two different formulae that can do either the direct or the indirect search. For the purpose of my argument there is no difference because you need an expert to apply either one or to know which one to apply. Such a formula wouldn't be capable of automatic adjustment. Lists of half a million entries change constantly, and the formula would have to be tweaked to match the precise current parameters of both lists. Even to freeze these parameters for long enough for the formula to run may pose a problem. I am referring to addition and deletion of rows here, not changes in values, and it may not be possible to create a result that will not crash when the number of included rows changes.
In case you aren't familiar with Array Formulae, this would be a single formula capable of repeating itself, say, 5000 times to create a list. You could Copy and Paste Special to fixate that list, but until you do it is virtual in nature and it is liable to fail whenever it's parameters aren't what they were expected to be. So, you would be stuck with copying and pasting lists. If they have a few thousand names you will want to automate the task. You would want a macro for that purpose.
Producing that same list with a macro in the first place has none of the drawbacks I have outlined above. There is very little code involved which not only would be much more flexible than an Array Formula with regard to changing numbers of rows but also easier for the layman to understand and, hence, apply and maintain.
It seems that the drift of my analysis is to shift the goal posts. The choice wouldn't be so much between macro and formula as between virtual and fixed lists. Consider that a formula shows a result of it's calculation, but that result changes with the function's parameters which may include their becoming inaccessible causing the formula to fail. If your need is for a fixed list a formula would be unable - other than indirectly - to provide it.
Happy New Year!
Have a great day!
Sisyphus
I do this for "honour and country" - much less of the latter, actually.
If I helped you, award points, plenty of them.
If I bored you, deduct points for being too long-winded. (I know, :lol)