Is it ever okay to use lists in a relational database?











69 votes, 13 favorites

I've been trying to design a database to go with a project concept and ran into what seems like a hotly debated issue. I've read a few articles and some Stack Overflow answers that state it's never (or almost never) okay to store a list of IDs or the like in a field -- all data should be relational, etc.



The problem I'm running into, though, is that I'm trying to make a task assigner. People will create tasks, assign them to multiple people, and it will save to the database.



Of course, if I save these tasks individually in "Person", I'll have to have dozens of dummy "TaskID" columns and micro-manage them because there can be 0 to 100 tasks assigned to one person, say.



Then again, if I save the tasks in a "Tasks" table, I'll have to have dozens of dummy "PersonID" columns and micro-manage them -- same problem as before.



For a problem like this, is it okay to save a list of IDs in one form or another, or am I just not thinking of another way this is achievable without breaking principles?










  • 17




    I realize this is tagged "relational database" so I'll just leave it as a comment not an answer, but in other types of databases it does make sense to store lists. Cassandra comes to mind since it has no joins.
    – Captain Man
    Nov 14 at 15:17






  • 7




    Good job in researching and then asking here! Indeed, the 'recommendation' to never violate the 1st normal form did really well for you, because you really should come up with another, relational approach, namely a "many-to-many" relation, for which there is a standard pattern in relational databases which should be used.
    – JimmyB
    2 days ago








  • 4




    "Is it ever okay" yes.... whatever follows, the answer is yes. As long as you have a valid reason. There's always a use case that compels you to violate best practices because it makes sense to do so. (In your case, though, you definitely shouldn't)
    – xyious
    2 days ago






  • 1




    I'm currently using an array (not a delimited string -- a VARCHAR ARRAY) to store a list of tags. That's probably not how they'll end up being stored later down the line, but lists can be extremely useful during the prototyping stages, when you have nothing else to point to and don't want to build out the entire database schema before you can do anything else.
    – Nic Hartley
    2 days ago






  • 1




    If you do this then at some point, somewhere down the line, you'll regret it
    – Caius Jard
    2 days ago

















database database-design sql relational-database






edited Nov 15 at 5:12 by Akavall






asked Nov 14 at 4:25 by linus72982




8 Answers

209 votes, accepted










The key word and key concept you need to investigate is database normalization.



Rather than adding info about the assignments to the Persons or Tasks tables, you add a new table that holds the assignment info, with the relevant relationships.

For example, suppose you have the following tables:



Persons:



+----+-----------+
| ID | Name      |
+====+===========+
| 1  | Alfred    |
| 2  | Jebediah  |
| 3  | Jacob     |
| 4  | Ezekiel   |
+----+-----------+


Tasks:



+----+--------------------+
| ID | Name               |
+====+====================+
| 1  | Feed the Chickens  |
| 2  | Plow               |
| 3  | Milking Cows       |
| 4  | Raise a barn       |
+----+--------------------+


You would then create a third table with Assignments. This table would model the relationship between the people and the tasks:



+----+-----------+---------+
| ID | PersonId  | TaskId  |
+====+===========+=========+
| 1  | 1         | 3       |
| 2  | 3         | 2       |
| 3  | 2         | 1       |
| 4  | 1         | 4       |
+----+-----------+---------+


We would then add Foreign Key constraints, so that the database enforces that every PersonId and TaskId is a valid ID in the table it refers to. In the first row, we can see PersonId is 1, so Alfred is assigned to TaskId 3, Milking Cows.
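
As a rough sketch, the three tables and their foreign keys could be declared as follows. PostgreSQL-style SQL is assumed; the column types and the UNIQUE constraint are assumptions added for illustration and are not part of the tables shown above.

```sql
-- Sketch only: table and column names follow the example tables above;
-- the types and the UNIQUE constraint are assumptions.
CREATE TABLE Persons (
    ID   integer PRIMARY KEY,
    Name varchar(100) NOT NULL
);

CREATE TABLE Tasks (
    ID   integer PRIMARY KEY,
    Name varchar(100) NOT NULL
);

CREATE TABLE Assignments (
    ID       integer PRIMARY KEY,
    PersonId integer NOT NULL REFERENCES Persons (ID),  -- must match an existing person
    TaskId   integer NOT NULL REFERENCES Tasks (ID),    -- must match an existing task
    UNIQUE (PersonId, TaskId)  -- assumption: a person is assigned to a given task at most once
);

INSERT INTO Persons (ID, Name) VALUES
    (1, 'Alfred'), (2, 'Jebediah'), (3, 'Jacob'), (4, 'Ezekiel');

INSERT INTO Tasks (ID, Name) VALUES
    (1, 'Feed the Chickens'), (2, 'Plow'), (3, 'Milking Cows'), (4, 'Raise a barn');

INSERT INTO Assignments (ID, PersonId, TaskId) VALUES
    (1, 1, 3), (2, 3, 2), (3, 2, 1), (4, 1, 4);

-- The following insert would be rejected by the foreign key on PersonId,
-- because there is no person with ID 99:
-- INSERT INTO Assignments (ID, PersonId, TaskId) VALUES (5, 99, 1);
```

With the constraints declared, the database itself refuses assignments that point at a person or task that does not exist, so the application does not have to check this by hand.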



What you should be able to see here is that you can have as few or as many assignments per task or per person as you want. In this example, Ezekiel isn't assigned any tasks, and Alfred is assigned 2. If you have one task with 100 people, running SELECT PersonId FROM Assignments WHERE TaskId=<whatever>; will yield 100 rows, one for each person assigned. You can filter with WHERE on the PersonId instead to find all of the tasks assigned to that person.



If you want your queries to return the names of the persons and tasks instead of their IDs, then you get to learn how to JOIN tables.
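
A minimal sketch of such a JOIN is shown below. To keep it self-contained, the sample rows from the tables above are supplied inline through common table expressions; with real tables you would drop the WITH clause. PostgreSQL-style syntax is assumed, and the output column aliases are made up for illustration.

```sql
-- Inline copies of the example data, so the query runs on its own.
WITH Persons (ID, Name) AS (
    VALUES (1, 'Alfred'), (2, 'Jebediah'), (3, 'Jacob'), (4, 'Ezekiel')
),
Tasks (ID, Name) AS (
    VALUES (1, 'Feed the Chickens'), (2, 'Plow'), (3, 'Milking Cows'), (4, 'Raise a barn')
),
Assignments (ID, PersonId, TaskId) AS (
    VALUES (1, 1, 3), (2, 3, 2), (3, 2, 1), (4, 1, 4)
)
-- Replace each ID in Assignments with the matching name.
SELECT p.Name AS PersonName,
       t.Name AS TaskName
FROM Assignments AS a
JOIN Persons AS p ON p.ID = a.PersonId
JOIN Tasks   AS t ON t.ID = a.TaskId
ORDER BY a.ID;
-- Alfred   | Milking Cows
-- Jacob    | Plow
-- Jebediah | Feed the Chickens
-- Alfred   | Raise a barn
```

Ezekiel does not appear because an inner join only returns persons who have at least one assignment.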






  • 69




    The keyword you want to search to learn more is "many-to-many relationship"
    – BlueRaja - Danny Pflughoeft
    Nov 14 at 9:38








  • 27




    To elaborate a little on Thierry's comment: You may think that you do not need to normalize because "I only need X and it is very simple to store the ID list", but for any system that may get extended later you will regret not having normalized it earlier. Always normalize; the only question is to what normal form
    – Jan Doggen
    Nov 14 at 10:18








  • 7




    Agreed with @Jan - against my better judgement I permitted my team to take a design shortcut a while back, storing JSON instead for something that "won't need to be extended". That lasted like six months FML. Our upgrader then had a nasty fight on its hands to migrate the JSON to the scheme we should have started with. I really should have known better.
    – Lightness Races in Orbit
    Nov 14 at 11:57








  • 11




    @Deduplicator: it's just a representation of a garden-variety, auto-increment integer primary key column. Pretty typical stuff.
    – whatsisname
    Nov 14 at 20:21








  • 6




    @whatsisname On the Persons or Tasks table, I'd agree with you. On a bridge table where the sole purpose is to represent the many-to-many relationship between two other tables that already have surrogate keys? I wouldn't add one without a good reason. It's just overhead as it will never be used in queries or relationships.
    – jpmc26
    Nov 14 at 22:00




















29 votes













You're asking two questions here.



First, you ask if it's OK to store lists serialized in a column. Yes, it's fine, if your project calls for it. An example might be product ingredients for a catalog page, where you have no desire to track each ingredient individually.

Unfortunately, your second question describes a scenario where you should opt for a more relational approach. You'll need three tables: one for the people, one for the tasks, and one that maintains the list of which task is assigned to which people. That last one would be vertical, one row per person/task combination, with columns for your primary key, task ID, and person ID.






  • 8




    The ingredient example you reference is correct on the surface; but it would be plaintext in that case. It is not a list in the programming sense (unless you mean that the string is a list of characters which you obviously don't). OP describing their data as "a list of IDs" (or even just "a list of [..]") implies that they are at some point handling this data as individual objects.
    – Flater
    Nov 14 at 11:11








  • 9




    @Flater: But it is a list. You need to be able to reformat it as (variously) an HTML list, a Markdown list, a JSON list, etc. in order to ensure the items are displayed properly in (variously) a web page, a plain text document, a mobile app... and you can't really do that with plain text.
    – Kevin
    Nov 14 at 18:48








  • 9




    @Kevin If that is your goal, then it is much more readily and easily achieved by storing the ingredients in a table! Not to mention if, later, people would ... oh, I don't know, say, wish for recommended substitutes, or something silly like look for all recipes without any peanuts, or gluten, or animal proteins...
    – Dan Bron
    Nov 14 at 20:49








  • 9




    @DanBron: YAGNI. Right now we're only using a list because it makes the UI logic easier. If we need or will need list-like behavior in the business logic layer, then it should be normalized into a separate table. Tables and joins are not necessarily expensive, but they're not free, and they bring in questions about element order ("Do we care about the order of ingredients?") and further normalization ("Are you going to turn '3 eggs' into ('eggs', 3)? What about 'Salt, to taste', is that ('salt', NULL)?").
    – Kevin
    Nov 14 at 20:54








  • 6




    @Kevin: YAGNI is quite wrong here. You yourself argued the necessity of being able to transform the list in many ways (HTML, markdown, JSON) and thus are arguing that you need the individual elements of the list. Unless the data storage and "list handling" applications are two applications that are developed independently (and do note that separate application layers != separate applications), the database structure should always be created to store the data in a format that leaves it readily available - while avoiding additional parsing/conversion logic.
    – Flater
    2 days ago




















20 votes













What you're describing is known as a "many to many" relationship, in your case between Person and Task. It's typically implemented using a third table, sometimes called a "link" or "cross-reference" table. For example:



create table person (
    person_id integer primary key,
    ...
);

create table task (
    task_id integer primary key,
    ...
);

-- link table: one row per (person, task) assignment
create table person_task_xref (
    person_id integer not null,
    task_id   integer not null,
    primary key (person_id, task_id),  -- also prevents duplicate assignments
    foreign key (person_id) references person (person_id),
    foreign key (task_id) references task (task_id)
);





  • 2




    You may also want to add an index with task_id first, if you might be doing queries filtered by task.
    – jpmc26
    yesterday










  • Also known as a bridge table. Also, wish I could give you an extra plus for not having an identity column, although I would recommend an index on each column.
    – jmoreno
    3 hours ago
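
The index suggested in the comments above might look roughly like this. It is a sketch in PostgreSQL-style SQL; the index name is hypothetical, and the person and task tables are reduced to their key columns so that the example runs on its own.

```sql
-- Minimal stand-ins for the person and task tables from the answer.
CREATE TABLE person (person_id integer PRIMARY KEY);
CREATE TABLE task   (task_id   integer PRIMARY KEY);

CREATE TABLE person_task_xref (
    person_id integer NOT NULL REFERENCES person (person_id),
    task_id   integer NOT NULL REFERENCES task (task_id),
    PRIMARY KEY (person_id, task_id)  -- its index leads with person_id
);

-- Hypothetical index name. Leading with task_id supports lookups by task.
CREATE INDEX person_task_xref_task_idx
    ON person_task_xref (task_id, person_id);

-- A query that filters by task and can use the new index:
SELECT person_id
FROM person_task_xref
WHERE task_id = 3;
```

The primary key index already leads with person_id, so lookups by person are covered; the second index covers the opposite direction.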


















11 votes














... it's never (or almost never) okay to store a list of IDs or the like in a field




The only time you might store more than one data item in a single field is when that field is only ever used as a single entity and is never considered as being made up of those smaller elements. An example might be an image, stored in a BLOB field. It's made up of lots and lots of smaller elements (bytes) but these mean nothing to the database and can only be used all together (and look pretty to an End User).



Since a "list" is, by definition, made up of smaller elements (items), this isn't the case here and you should normalise the data.




... if I save these tasks individually in "Person", I'll have to have dozens of dummy "TaskID" columns ...




No. You'll have a few rows in an Intersection Table (a.k.a. Weak Entity) between Person and Task. Databases are really good at working with lots of rows; they're actually pretty rubbish at working with lots of [repeated] columns.



Nice clear example given by whatsisname.






  • 3




    When creating real life systems "never say never" is a very good rule to live by.
    – l0b0
    Nov 14 at 21:31






  • 1




    In many cases, the per-element cost of maintaining or retrieving a list in normalized form may vastly exceed the cost of keeping the items as a blob, since each item of the list would have to hold the identity of the master item with which it is associated and its location within the list in addition to the actual data. Even in cases where code might benefit from being able to update some list elements without updating the entire list, it might be cheaper to store everything as a blob and rewrite everything whenever one has to rewrite anything.
    – supercat
    yesterday


















2 votes













It may be legitimate in certain pre-calculated fields.



If some of your queries are expensive and you decide to go with pre-calculated fields updated automatically using database triggers, then it may be legitimate to keep the lists inside a column.



For example, in the UI you want to show this list using a grid view, where each row can open full details (with complete lists) after double-clicking:



REGISTERED USER LIST
+------------------+----------------------------------------------------+
|Name              |Top 3 most visited tags                             |
+==================+====================================================+
|Peter             |Design, Fitness, Gifts                              |
+------------------+----------------------------------------------------+
|Lucy              |Fashion, Gifts, Lifestyle                           |
+------------------+----------------------------------------------------+

You keep the second column updated by a trigger when the client visits a new article, or by a scheduled task.

You can make such a field available even for searching (as normal text).

For such cases, keeping lists is legitimate. You just need to consider the case of possibly exceeding the maximum field length.
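
One way the trigger-maintained column could be sketched is shown below. It assumes PostgreSQL 11 or later (for EXECUTE FUNCTION); all table, column, function and trigger names are hypothetical, and the visit counts are made-up sample data. It ignores deletes and changes of user_id, which a real implementation would also have to handle.

```sql
-- Hypothetical schema: the normalized data lives in tag_visit,
-- and app_user.top_tags is the pre-calculated list kept in sync by a trigger.
CREATE TABLE app_user (
    user_id  integer PRIMARY KEY,
    name     text NOT NULL,
    top_tags text              -- denormalized: maintained by the trigger below
);

CREATE TABLE tag_visit (
    user_id integer NOT NULL REFERENCES app_user (user_id),
    tag     text    NOT NULL,
    visits  integer NOT NULL DEFAULT 1,
    PRIMARY KEY (user_id, tag)
);

CREATE FUNCTION refresh_top_tags() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    -- Recompute the three most visited tags for the affected user.
    UPDATE app_user AS u
    SET top_tags = (
        SELECT string_agg(t.tag, ', ' ORDER BY t.tag)
        FROM (
            SELECT v.tag
            FROM tag_visit AS v
            WHERE v.user_id = NEW.user_id
            ORDER BY v.visits DESC, v.tag
            LIMIT 3
        ) AS t
    )
    WHERE u.user_id = NEW.user_id;
    RETURN NULL;  -- the return value is ignored for AFTER row triggers
END;
$$;

CREATE TRIGGER tag_visit_refresh
AFTER INSERT OR UPDATE ON tag_visit
FOR EACH ROW EXECUTE FUNCTION refresh_top_tags();

-- Made-up sample data.
INSERT INTO app_user (user_id, name) VALUES (1, 'Peter');
INSERT INTO tag_visit (user_id, tag, visits) VALUES
    (1, 'Design', 12), (1, 'Fitness', 9), (1, 'Gifts', 7), (1, 'Travel', 2);

SELECT name, top_tags FROM app_user;
-- Peter | Design, Fitness, Gifts
```

The normalized tag_visit table stays the source of truth; the list column is only a cached copy for display, which is what keeps this kind of duplication under control.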





Also, if you are using Microsoft Access, its multivalued fields are another special use case. They handle your lists in a field automatically.

But you can always fall back to the standard normalized form shown in other answers.

Summary: The normal forms of a database are a theoretical model required for understanding important aspects of data modeling. But of course normalization does not take into account performance or other costs of retrieving the data; that is out of scope for the theoretical model. Storing lists or other pre-calculated (and controlled) duplicates is often required by a practical implementation.

In light of the above, in a practical implementation, would we prefer a query relying on perfect normal form that runs for 20 seconds, or an equivalent query relying on pre-calculated values that takes 0.08 s? No one likes their software product to be accused of slowness.






  • 1




    It can be legitimate even without precalculated stuff. I've done it a couple of times where the data is stored properly but for performance reasons it's useful to stuff a few cached results in the main records.
    – Loren Pechtel
    Nov 15 at 4:10










  • @LorenPechtel – Yes, thanks, in my use of term pre-calculated I also include cases of cached values stored where needed. In systems with complex dependencies, they are the way to keep the performance normal. And if programmed with adequate know-how, these values are reliable and always-in-sync. I just did not want to add case of caching into the answer to keep the answer simple and on safe side. It got downvoted anyway. :)
    – miroxlav
    2 days ago












  • @LorenPechtel Actually, that would still be a bad reason... cache data should be kept in an intermediate cache store, and while the cache is still valid, that query should never hit the main DB.
    – Tezra
    yesterday










  • @Tezra No, I'm saying that sometimes a piece of data from a secondary table is needed often enough to make it make sense to put a copy in the main record. (Example that I have done--the employee table includes the last time in and the last time out. They are used only for display purposes, any actual calculation comes from the table with the clock-in/clock-out records.)
    – Loren Pechtel
    yesterday


















0 votes













Given two tables, which we'll call Person and Task, each with its own ID (PersonID, TaskID), the basic idea is to create a third table to bind them together. We'll call this table PersonToTask. At the minimum it should have its own ID, as well as the two others.
So when it comes to assigning someone to a task, you no longer need to UPDATE the Person table; you just need to INSERT a new row into the PersonToTask table.
Maintenance also becomes easier: deleting a task is just a DELETE based on TaskID, with no more updating the Person table and its associated parsing.



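CREATE TABLE dbo.PersonToTask (
    pttID    INT IDENTITY(1,1) NOT NULL,
    PersonID INT NULL,
    TaskID   INT NULL
)
GO  -- batch separator (SSMS/sqlcmd): CREATE PROCEDURE must start its own batch

-- Assign a person to a task: one new row in the link table
CREATE PROCEDURE dbo.Task_Assigned (@PersonID INT, @TaskID INT)
AS
BEGIN
    INSERT dbo.PersonToTask (PersonID, TaskID)
    VALUES (@PersonID, @TaskID)
END
GO

-- Delete a task: remove its assignments first, then the task itself
CREATE PROCEDURE dbo.Task_Deleted (@TaskID INT)
AS
BEGIN
    DELETE dbo.PersonToTask WHERE TaskID = @TaskID
    DELETE dbo.Task WHERE TaskID = @TaskID
END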


How about a simple report of who is assigned to a task?



CREATE PROCEDURE dbo.Task_CurrentAssigned (@TaskID INT)
AS
BEGIN
    SELECT PersonName
    FROM dbo.Person
    WHERE PersonID IN (SELECT PersonID FROM dbo.PersonToTask WHERE TaskID = @TaskID)
END


You of course could do a lot more; a TimeReport could be done if you added DateTime fields for TaskAssigned and TaskCompleted. It's all up to you






– Mad Myche (new contributor)

























0 votes













It may work if, say, you have human-readable primary keys and want a list of task numbers without having to deal with the vertical nature of a table structure, i.e. the first table below is much easier to read.



------------------------
Employee Name | Task
Jack          | 1,2,5
Jill          | 4,6,7
------------------------

------------------------
Employee Name | Task
Jack          | 1
Jack          | 2
Jack          | 5
Jill          | 4
Jill          | 6
Jill          | 7
------------------------


The question would then be whether the task list should be stored or generated on demand. That largely depends on requirements such as how often the lists are needed, how accurate they must be, how many data rows exist, how the data will be used, etc. After that, the trade-offs between user experience and meeting requirements should be analyzed.

For example, compare the time it would take to recall the 2 rows with the time to run a query that would generate the 2 rows. If the query takes long and the user does not need the most up-to-date list (expecting less than 1 change per day), then it could be stored.

Or, if the user needs a historical record of the tasks assigned to them, it would also make sense to store the list. So it really depends on what you are doing; never say never.
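
For the "generated on demand" option, a sketch of building the readable first table from the vertical second one is shown below. PostgreSQL's string_agg is assumed (other engines have similar functions under different names), and the employee_task name and its columns are hypothetical. The rows are supplied inline so the query runs on its own.

```sql
-- Inline copy of the vertical table from above.
WITH employee_task (employee_name, task) AS (
    VALUES ('Jack', 1), ('Jack', 2), ('Jack', 5),
           ('Jill', 4), ('Jill', 6), ('Jill', 7)
)
-- Collapse each employee's rows into one comma-separated list.
SELECT employee_name,
       string_agg(task::text, ',' ORDER BY task) AS tasks
FROM employee_task
GROUP BY employee_name
ORDER BY employee_name;
-- Jack | 1,2,5
-- Jill | 4,6,7
```

Whether this query is fast enough to run on every request, or whether its result should be stored, is exactly the trade-off described above.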






– Double E CPU (new contributor)


















    • As you say, it all depends on how the data is to be retrieved. If you /only/ ever query this table by User Name, then the "list" field is perfectly adequate. However, how can you query such a table to find out who is working on Task #1234567 and still keep it performant? Just about every kind of "find-X-anywhere-in-the-field" String function will cause such a query to /Table Scan/, slowing things to a crawl. With properly normalised, properly indexed data, that just doesn't happen.
      – Phill W.
      2 days ago
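
The difference the comment above describes can be sketched as follows, in PostgreSQL-style SQL. Both table layouts and all names are hypothetical, and how any particular engine plans these queries is not shown here.

```sql
-- List-in-a-field design: finding who works on task 5 needs a pattern match.
CREATE TABLE employee_list (
    employee_name text PRIMARY KEY,
    tasks         text NOT NULL      -- e.g. '1,2,5'
);

-- The commas are added so that task 5 does not also match 15 or 50.
-- A leading-wildcard LIKE generally cannot use an ordinary B-tree index.
SELECT employee_name
FROM employee_list
WHERE ',' || tasks || ',' LIKE '%,5,%';

-- Normalized design: one row per employee/task pair.
CREATE TABLE employee_task (
    employee_name text    NOT NULL,
    task          integer NOT NULL,
    PRIMARY KEY (employee_name, task)
);

CREATE INDEX employee_task_task_idx ON employee_task (task);

-- A plain equality test that the index on task can serve.
SELECT employee_name
FROM employee_task
WHERE task = 5;
```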


















0 votes













    You're taking what should be another table, turning it through 90 degrees and shoehorning it into another table.



    It's like having an order table where you have itemProdcode1, itemQuantity1, itemPrice1 ... itemProdcode37, itemQuantity37, itemPrice37. Apart from being awkward to handle programmatically you can guarantee that tomorrow someone will want to order 38 things.



    I'd only do it your way if the 'list' isn't really a list, i.e. where it stands as a whole and each individual line item doesn't refer to some clear and independent entity. In that case just stuff it all in some data type that's big enough.



    So an order is a list, a Bill Of Materials is a list (or a list of lists, which would be even more of a nightmare to implement "sideways"). But a note/comment and a poem aren't.
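For the order example, a minimal sketch of the "upright" layout in Python with the standard sqlite3 module. The table and column names (Orders, OrderItems, LineNo and so on) and the sample values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (OrderId INTEGER PRIMARY KEY);
    -- One row per line item instead of itemProdcode1 ... itemProdcode37.
    CREATE TABLE OrderItems (
        OrderId  INTEGER NOT NULL REFERENCES Orders (OrderId),
        LineNo   INTEGER NOT NULL,
        ProdCode TEXT    NOT NULL,
        Quantity INTEGER NOT NULL,
        Price    REAL    NOT NULL,
        PRIMARY KEY (OrderId, LineNo)
    );
""")
conn.execute("INSERT INTO Orders (OrderId) VALUES (1)")

# Ordering 38 things just means 38 rows; the schema does not change.
conn.executemany(
    "INSERT INTO OrderItems VALUES (1, ?, ?, 1, 2.50)",
    [(n, f"PROD-{n:03d}") for n in range(1, 39)],
)

item_count, order_total = conn.execute(
    "SELECT COUNT(*), SUM(Quantity * Price) FROM OrderItems WHERE OrderId = 1"
).fetchone()
print(item_count, order_total)  # 38 95.0
```

Totals and counts become ordinary aggregate queries, which is hard to do when each item lives in its own numbered column.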














    – Bloke Down The Pub

















      protected by gnat Nov 15 at 5:28

















      8 Answers








      up vote
      209
      down vote



      accepted










      The key word and key concept you need to investigate is database normalization.



      Rather than adding info about the assignments to the Persons or Tasks tables, you add a new table that holds the assignment info, with the relevant relationships.



      For example, say you have the following tables:



      Persons:



      +----+-----------+
      | ID | Name      |
      +====+===========+
      | 1  | Alfred    |
      | 2  | Jebediah  |
      | 3  | Jacob     |
      | 4  | Ezekiel   |
      +----+-----------+


      Tasks:



      +----+--------------------+
      | ID | Name               |
      +====+====================+
      | 1  | Feed the Chickens  |
      | 2  | Plow               |
      | 3  | Milking Cows       |
      | 4  | Raise a barn       |
      +----+--------------------+


      You would then create a third table, Assignments, that models the relationship between the people and the tasks:



      +----+-----------+---------+
      | ID | PersonId  | TaskId  |
      +====+===========+=========+
      | 1  | 1         | 3       |
      | 2  | 3         | 2       |
      | 3  | 2         | 1       |
      | 4  | 1         | 4       |
      +----+-----------+---------+


      We would then add Foreign Key constraints, so that the database enforces that every PersonId and TaskId is a valid ID in the table it refers to. In the first row, PersonId is 1 and TaskId is 3, so Alfred is assigned to Milking Cows.
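A minimal sketch of these three tables and the foreign keys, using Python's standard sqlite3 module. SQLite is only an assumption here; the column types and the invalid PersonId 99 are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Persons (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE Tasks   (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE Assignments (
        ID       INTEGER PRIMARY KEY,
        PersonId INTEGER NOT NULL REFERENCES Persons (ID),
        TaskId   INTEGER NOT NULL REFERENCES Tasks (ID)
    );
    INSERT INTO Persons VALUES
        (1, 'Alfred'), (2, 'Jebediah'), (3, 'Jacob'), (4, 'Ezekiel');
    INSERT INTO Tasks VALUES
        (1, 'Feed the Chickens'), (2, 'Plow'), (3, 'Milking Cows'), (4, 'Raise a barn');
    INSERT INTO Assignments VALUES (1, 1, 3), (2, 3, 2), (3, 2, 1), (4, 1, 4);
""")

# There is no person 99, so the database refuses this assignment.
try:
    conn.execute("INSERT INTO Assignments (PersonId, TaskId) VALUES (99, 1)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

assignment_count = conn.execute("SELECT COUNT(*) FROM Assignments").fetchone()[0]
print(rejected, assignment_count)  # True 4
```

With a list of IDs stored in a text field, nothing would stop that 99 from being written.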



      What you should be able to see here is that you can have as few or as many assignments per task or per person as you want. In this example, Ezekiel isn't assigned any tasks, and Alfred is assigned 2. If you have one task with 100 people, SELECT PersonId FROM Assignments WHERE TaskId=<whatever>; will yield 100 rows, one for each person assigned. You can filter on PersonId in the WHERE clause instead to find all of the tasks assigned to that person.
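The two lookups described above, sketched with Python's standard sqlite3 module and the example rows from the Assignments table. The helper names persons_on_task and tasks_for_person are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Assignments (ID INTEGER PRIMARY KEY, PersonId INTEGER, TaskId INTEGER)"
)
conn.executemany(
    "INSERT INTO Assignments VALUES (?, ?, ?)",
    [(1, 1, 3), (2, 3, 2), (3, 2, 1), (4, 1, 4)],
)


def persons_on_task(task_id):
    """IDs of everyone assigned to one task."""
    rows = conn.execute(
        "SELECT PersonId FROM Assignments WHERE TaskId = ? ORDER BY PersonId",
        (task_id,),
    )
    return [row[0] for row in rows]


def tasks_for_person(person_id):
    """IDs of every task assigned to one person."""
    rows = conn.execute(
        "SELECT TaskId FROM Assignments WHERE PersonId = ? ORDER BY TaskId",
        (person_id,),
    )
    return [row[0] for row in rows]


print(tasks_for_person(1))  # Alfred: [3, 4]
print(tasks_for_person(4))  # Ezekiel: []
print(persons_on_task(2))   # Plow: [3]
```

Zero, one, or a hundred assignments all use the same query; only the number of rows returned changes.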



      If you want your queries to return the person names and task names instead of the IDs, then you get to learn how to JOIN tables.
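As a rough illustration of such a JOIN, here is a sketch with Python's standard sqlite3 module and the example data from the tables above. The aliases p, t and a are arbitrary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Persons (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE Tasks   (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE Assignments (
        ID       INTEGER PRIMARY KEY,
        PersonId INTEGER NOT NULL REFERENCES Persons (ID),
        TaskId   INTEGER NOT NULL REFERENCES Tasks (ID)
    );
    INSERT INTO Persons VALUES
        (1, 'Alfred'), (2, 'Jebediah'), (3, 'Jacob'), (4, 'Ezekiel');
    INSERT INTO Tasks VALUES
        (1, 'Feed the Chickens'), (2, 'Plow'), (3, 'Milking Cows'), (4, 'Raise a barn');
    INSERT INTO Assignments VALUES (1, 1, 3), (2, 3, 2), (3, 2, 1), (4, 1, 4);
""")

# Each assignment row is matched to its person and its task by ID.
named_assignments = conn.execute("""
    SELECT p.Name, t.Name
    FROM Assignments AS a
    JOIN Persons AS p ON p.ID = a.PersonId
    JOIN Tasks   AS t ON t.ID = a.TaskId
    ORDER BY a.ID
""").fetchall()

for person, task in named_assignments:
    print(person, "->", task)
```

This prints one line per assignment, starting with "Alfred -> Milking Cows". Ezekiel does not appear because an inner JOIN only returns people who have at least one assignment.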

























      • 69




        The keyword you want to search to learn more is "many-to-many relationship"
        – BlueRaja - Danny Pflughoeft
        Nov 14 at 9:38








      • 27




        To elaborate a little on Thierry's comment: you may think that you do not need to normalize because "I only need X and it is very simple to store the ID list", but for any system that may get extended later you will regret not having normalized it earlier. Always normalize; the only question is to what normal form.
        – Jan Doggen
        Nov 14 at 10:18








      • 7




        Agreed with @Jan - against my better judgement I permitted my team to take a design shortcut a while back, storing JSON instead for something that "won't need to be extended". That lasted like six months FML. Our upgrader then had a nasty fight on its hands to migrate the JSON to the scheme we should have started with. I really should have known better.
        – Lightness Races in Orbit
        Nov 14 at 11:57








      • 11




        @Deduplicator: it's just a representation of a garden-variety, auto-increment integer primary key column. Pretty typical stuff.
        – whatsisname
        Nov 14 at 20:21








      • 6




        @whatsisname On the Persons or Tasks table, I'd agree with you. On a bridge table where the sole purpose is to represent the many-to-many relationship between two other tables that already have surrogate keys? I wouldn't add one without a good reason. It's just overhead as it will never be used in queries or relationships.
        – jpmc26
        Nov 14 at 22:00
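One way to read this suggestion, sketched with Python's standard sqlite3 module: the bridge table has no ID column of its own, and the pair (PersonId, TaskId) is the primary key. This is an assumption about what the comment has in mind, not a quote from it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- No surrogate ID: the pair of foreign keys identifies the row.
    CREATE TABLE Assignments (
        PersonId INTEGER NOT NULL,
        TaskId   INTEGER NOT NULL,
        PRIMARY KEY (PersonId, TaskId)
    );
""")
conn.execute("INSERT INTO Assignments VALUES (1, 3)")

# The composite key also stops the same person/task pair being stored twice.
try:
    conn.execute("INSERT INTO Assignments VALUES (1, 3)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Assignments").fetchone()[0]
print(duplicate_rejected, row_count)  # True 1
```

With a separate auto-increment ID as the only key, the duplicate row would be accepted unless a UNIQUE constraint were added as well.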

















      edited yesterday









      Basil Bourque











      answered Nov 14 at 4:47









      whatsisname























      up vote
      29
      down vote













      You're asking two questions here.



      First, you ask if it's OK to store lists serialized in a column. Yes, it's fine, if your project calls for it. An example might be product ingredients for a catalog page, where you have no desire to track each ingredient individually.
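A small sketch of that ingredients case, using Python's standard sqlite3 and json modules. The Products table, the choice of JSON as the serialization, and the sample product are assumptions for illustration.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Products ("
    "ID INTEGER PRIMARY KEY, Name TEXT NOT NULL, Ingredients TEXT NOT NULL)"
)

# The list is written and read back as one opaque value.
ingredients = ["flour", "water", "salt", "yeast"]
conn.execute(
    "INSERT INTO Products VALUES (1, 'Bread', ?)",
    (json.dumps(ingredients),),
)

stored = conn.execute("SELECT Ingredients FROM Products WHERE ID = 1").fetchone()[0]
loaded = json.loads(stored)
print(loaded)  # ['flour', 'water', 'salt', 'yeast']
```

This works as long as the database never has to look inside the column. As soon as you need to search, join, or constrain individual ingredients, the list wants to become rows in its own table.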



      Unfortunately, your second question describes a scenario where you should opt for a more relational approach. You'll need 3 tables: one for the people, one for the tasks, and one that maintains the list of which task is assigned to which person. That last one would be vertical, with one row per person/task combination and columns for your primary key, task id, and person id.























      • 8




        The ingredient example you reference is correct on the surface; but it would be plaintext in that case. It is not a list in the programming sense (unless you mean that the string is a list of characters which you obviously don't). OP describing their data as "a list of IDs" (or even just "a list of [..]") implies that they are at some point handling this data as individual objects.
        – Flater
        Nov 14 at 11:11








      • 9




        @Flater: But it is a list. You need to be able to reformat it as (variously) an HTML list, a Markdown list, a JSON list, etc. in order to ensure the items are displayed properly in (variously) a web page, a plain text document, a mobile app... and you can't really do that with plain text.
        – Kevin
        Nov 14 at 18:48








      • 9




        @Kevin If that is your goal, then it is much more readily and easily achieved by storing the ingredients in a table! Not to mention if, later, people would ... oh, I don't know, say, wish for recommended substitutes, or something silly like look for all recipes without any peanuts, or gluten, or animal proteins...
        – Dan Bron
        Nov 14 at 20:49
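To make the "recipes without any peanuts" case concrete, here is a sketch with Python's standard sqlite3 module. The Recipes and RecipeIngredients tables and the two sample recipes are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Recipes (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE RecipeIngredients (
        RecipeId   INTEGER NOT NULL REFERENCES Recipes (ID),
        Ingredient TEXT    NOT NULL,
        PRIMARY KEY (RecipeId, Ingredient)
    );
    INSERT INTO Recipes VALUES (1, 'Satay'), (2, 'Bread');
    INSERT INTO RecipeIngredients VALUES
        (1, 'chicken'), (1, 'peanuts'), (2, 'flour'), (2, 'yeast');
""")

# Keep only recipes for which no ingredient row says 'peanuts'.
peanut_free = [
    row[0]
    for row in conn.execute("""
        SELECT r.Name
        FROM Recipes AS r
        WHERE NOT EXISTS (
            SELECT 1
            FROM RecipeIngredients AS ri
            WHERE ri.RecipeId = r.ID AND ri.Ingredient = 'peanuts'
        )
        ORDER BY r.Name
    """)
]
print(peanut_free)  # ['Bread']
```

With the ingredients stored as one text field, the same question turns into string matching, which also has to cope with spellings, separators and partial matches.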








      • 9




        @DanBron: YAGNI. Right now we're only using a list because it makes the UI logic easier. If we need or will need list-like behavior in the business logic layer, then it should be normalized into a separate table. Tables and joins are not necessarily expensive, but they're not free, and they bring in questions about element order ("Do we care about the order of ingredients?") and further normalization ("Are you going to turn '3 eggs' into ('eggs', 3)? What about 'Salt, to taste', is that ('salt', NULL)?").
        – Kevin
        Nov 14 at 20:54








      • 6




        @Kevin: YAGNI is quite wrong here. You yourself argued the necessity of being able to transform the list in many ways (HTML, markdown, JSON) and thus are arguing that you need the individual elements of the list. Unless the data storage and "list handling" applications are two applications that are developed independently (and do note that separate application layers != separate applications), the database structure should always be created to store the data in a format that leaves it readily available - while avoiding additional parsing/conversion logic.
        – Flater
        2 days ago



























      answered Nov 14 at 4:48









      GrandmasterB









      • 8




        The ingredient example you reference is correct on the surface; but it would be plaintext in that case. It is not a list in the programming sense (unless you mean that the string is a list of characters which you obviously don't). OP describing their data as "a list of IDs" (or even just "a list of [..]") implies that they are at some point handling this data as individual objects.
        – Flater
        Nov 14 at 11:11








      • 9




        @Flater: But it is a list. You need to be able to reformat it as (variously) an HTML list, a Markdown list, a JSON list, etc. in order to ensure the items are displayed properly in (variously) a web page, a plain text document, a mobile app... and you can't really do that with plain text.
        – Kevin
        Nov 14 at 18:48








      • 9




        @Kevin If that is your goal, then it is much more readily and easily achieved by storing the ingredients in a table! Not to mention if, later, people would ... oh, I don't know, say, wish for recommended substitutes, or something silly like look for all recipes without any peanuts, or gluten, or animal proteins...
        – Dan Bron
        Nov 14 at 20:49








      • 9




        @DanBron: YAGNI. Right now we're only using a list because it makes the UI logic easier. If we need or will need list-like behavior in the business logic layer, then it should be normalized into a separate table. Tables and joins are not necessarily expensive, but they're not free, and they bring in questions about element order ("Do we care about the order of ingredients?") and further normalization ("Are you going to turn '3 eggs' into ('eggs', 3)? What about 'Salt, to taste', is that ('salt', NULL)?").
        – Kevin
        Nov 14 at 20:54








      • 6




        @Kevin: YAGNI is quite wrong here. You yourself argued the necessity of being able to transform the list in many ways (HTML, markdown, JSON) and thus are arguing that you need the individual elements of the list. Unless the data storage and "list handling" applications are two applications that are developed independently (and do note that separate application layers != separate applications), the database structure should always be created to store the data in a format that leaves it readily available - while avoiding additional parsing/conversion logic.
        – Flater
        2 days ago
















      • 8




        The ingredient example you reference is correct on the surface; but it would be plaintext in that case. It is not a list in the programming sense (unless you mean that the string is a list of characters which you obviously don't). OP describing their data as "a list of IDs" (or even just "a list of [..]") implies that they are at some point handling this data as individual objects.
        – Flater
        Nov 14 at 11:11








      • 9




        @Flater: But it is a list. You need to be able to reformat it as (variously) an HTML list, a Markdown list, a JSON list, etc. in order to ensure the items are displayed properly in (variously) a web page, a plain text document, a mobile app... and you can't really do that with plain text.
        – Kevin
        Nov 14 at 18:48








      • 9




        @Kevin If that is your goal, then it is much more readily and easily achieved by storing the ingredients in a table! Not to mention if, later, people would ... oh, I don't know, say, wish for recommended substitutes, or something silly like look for all recipes without any peanuts, or gluten, or animal proteins...
        – Dan Bron
        Nov 14 at 20:49








      • 9




        @DanBron: YAGNI. Right now we're only using a list because it makes the UI logic easier. If we need or will need list-like behavior in the business logic layer, then it should be normalized into a separate table. Tables and joins are not necessarily expensive, but they're not free, and they bring in questions about element order ("Do we care about the order of ingredients?") and further normalization ("Are you going to turn '3 eggs' into ('eggs', 3)? What about 'Salt, to taste', is that ('salt', NULL)?").
        – Kevin
        Nov 14 at 20:54








      • 6




        @Kevin: YAGNI is quite wrong here. You yourself argued the necessity of being able to transform the list in many ways (HTML, markdown, JSON) and thus are arguing that you need the individual elements of the list. Unless the data storage and "list handling" applications are two applications that are developed independently (and do note that separate application layers != separate applications), the database structure should always be created to store the data in a format that leaves it readily available - while avoiding additional parsing/conversion logic.
        – Flater
        2 days ago
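To make the trade-off in these comments concrete, here is a minimal sketch of what the normalized version could look like. The table and column names (recipe, recipe_ingredient, position, quantity) are assumptions for illustration, not something from the thread, and units are left out. A position column answers the ordering question, and a nullable quantity covers "Salt, to taste".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table recipe (
        recipe_id integer primary key,
        name text not null
    );
    create table recipe_ingredient (
        recipe_id integer not null references recipe (recipe_id),
        position integer not null,   -- keeps the display order
        ingredient text not null,
        quantity integer,            -- NULL stands for 'to taste'
        primary key (recipe_id, position)
    );
    insert into recipe values (1, 'Omelette'), (2, 'Satay sauce');
    insert into recipe_ingredient values
        (1, 1, 'eggs', 3),
        (1, 2, 'salt', null),
        (2, 1, 'peanuts', 1),
        (2, 2, 'salt', null);
""")

def ingredients(recipe_id):
    """Return (ingredient, quantity) pairs in display order."""
    rows = conn.execute(
        "select ingredient, quantity from recipe_ingredient "
        "where recipe_id = ? order by position", (recipe_id,))
    return rows.fetchall()

def recipes_without(ingredient):
    """Return the names of recipes that do not use the given ingredient."""
    rows = conn.execute(
        "select name from recipe r where not exists ("
        "  select 1 from recipe_ingredient ri"
        "  where ri.recipe_id = r.recipe_id and ri.ingredient = ?)"
        " order by name", (ingredient,))
    return [name for (name,) in rows]
```

The rows come back as individual items, so rendering them as HTML, Markdown or JSON needs no parsing step, and the "no peanuts" search is a plain query.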






















      up vote
      20
      down vote













      What you're describing is known as a "many to many" relationship, in your case between Person and Task. It's typically implemented using a third table, sometimes called a "link" or "cross-reference" table. For example:



      create table person (
          person_id integer primary key,
          ...
      );

      create table task (
          task_id integer primary key,
          ...
      );

      create table person_task_xref (
          person_id integer not null,
          task_id integer not null,
          primary key (person_id, task_id),
          foreign key (person_id) references person (person_id),
          foreign key (task_id) references task (task_id)
      );
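As a rough sketch of how this schema gets used, the following runs it in an in-memory SQLite database through Python's sqlite3 module. The name and title columns, the sample rows and the two helper functions are assumptions added for illustration; the answer leaves those columns elided.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")  # SQLite enforces foreign keys only on request
conn.executescript("""
    create table person (
        person_id integer primary key,
        name text not null            -- illustrative column
    );
    create table task (
        task_id integer primary key,
        title text not null           -- illustrative column
    );
    create table person_task_xref (
        person_id integer not null,
        task_id integer not null,
        primary key (person_id, task_id),
        foreign key (person_id) references person (person_id),
        foreign key (task_id) references task (task_id)
    );
    insert into person values (1, 'Ann'), (2, 'Bob');
    insert into task values (10, 'Write report'), (11, 'Review report');
    -- One row per assignment: task 10 is shared, Bob also has task 11.
    insert into person_task_xref values (1, 10), (2, 10), (2, 11);
""")

def tasks_for(person_id):
    """Titles of all tasks assigned to one person."""
    rows = conn.execute(
        "select t.title from task t "
        "join person_task_xref x on x.task_id = t.task_id "
        "where x.person_id = ? order by t.task_id", (person_id,))
    return [title for (title,) in rows]

def people_on(task_id):
    """Names of all people assigned to one task."""
    rows = conn.execute(
        "select p.name from person p "
        "join person_task_xref x on x.person_id = p.person_id "
        "where x.task_id = ? order by p.person_id", (task_id,))
    return [name for (name,) in rows]
```

A person with no tasks simply has no rows in person_task_xref, and a person with 100 tasks has 100 rows, so no dummy columns are needed in either direction.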






















      • 2




        You may also want to add an index with task_id first, if you might be doing queries filtered by task.
        – jpmc26
        yesterday










      • Also known as a bridge table. Also, wish I could give you an extra plus for not having an identity column, although I would recommend an index on each column.
        – jmoreno
        3 hours ago
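The indexing suggestion in these comments could look roughly like this in SQLite. The index name person_task_xref_by_task and the plan helper are made up for the sketch, and the foreign keys are omitted to keep it short. The composite primary key already serves lookups that start from person_id, so only the reverse direction needs an extra index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table person_task_xref (
        person_id integer not null,
        task_id integer not null,
        primary key (person_id, task_id)
    );
    -- Second index with task_id first, for queries filtered by task.
    create index person_task_xref_by_task
        on person_task_xref (task_id, person_id);
""")

def plan(sql):
    """Return SQLite's description of how it would execute the query."""
    rows = conn.execute("explain query plan " + sql).fetchall()
    return " ".join(row[-1] for row in rows)  # last field holds the detail text

by_task = plan("select person_id from person_task_xref where task_id = 10")
```

Inspecting by_task shows which index the planner picked for the task-filtered query. Other database engines have their own plan syntax, so treat this as SQLite-specific.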

























      answered Nov 14 at 4:46









      Mike Partridge

      5,30711738






















      up vote
      11
      down vote














      ... it's never (or almost never) okay to store a list of IDs or the like in a field




      The only time you might store more than one data item in a single field is when that field is only ever used as a single entity and is never considered as being made up of those smaller elements. An example might be an image, stored in a BLOB field. It's made up of lots and lots of smaller elements (bytes), but these mean nothing to the database and can only be used all together (and look pretty to an End User).



      Since a "list" is, by definition, made up of smaller elements (items), this isn't the case here and you should normalise the data.




      ... if I save these tasks individually in "Person", I'll have to have dozens of dummy "TaskID" columns ...




      No. You'll have a few rows in an Intersection Table (a.k.a. Weak Entity) between Person and Task. Databases are really good at working with lots of rows; they're actually pretty rubbish at working with lots of [repeated] columns.
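A small sketch of why the list-in-a-field version tends to go wrong, assuming a hypothetical comma-separated task_ids column. The table names person_list and person_task are invented for this comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Variant A: IDs packed into one text column.
    create table person_list (
        person_id integer primary key,
        task_ids text not null
    );
    insert into person_list values (1, '1,12'), (2, '11,21');

    -- Variant B: one row per assignment in an intersection table.
    create table person_task (
        person_id integer not null,
        task_id integer not null,
        primary key (person_id, task_id)
    );
    insert into person_task values (1, 1), (1, 12), (2, 11), (2, 21);
""")

# Naive substring search for task 1 also hits 11, 12 and 21.
packed = [pid for (pid,) in conn.execute(
    "select person_id from person_list "
    "where task_ids like '%1%' order by person_id")]

# Equality on a real column matches task 1 and nothing else.
rows = [pid for (pid,) in conn.execute(
    "select person_id from person_task "
    "where task_id = 1 order by person_id")]
```

Here packed contains both people even though only person 1 has task 1, while rows contains just person 1. The packed variant can be patched with delimiter tricks, but at that point the application is re-implementing what the database already does with rows.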



      Nice clear example given by whatsisname.























      • 3




        When creating real life systems "never say never" is a very good rule to live by.
        – l0b0
        Nov 14 at 21:31






      • 1




        In many cases, the per-element cost of maintaining or retrieving a list in normalized form may vastly exceed the cost of keeping the items as a blob, since each item of the list would have to hold the identity of the master item with which it is associated and its location within the list in addition to the actual data. Even in cases where code might benefit from being able to update some list elements without updating the entire list, it might be cheaper to store everything as a blob and rewrite everything whenever one has to rewrite anything.
        – supercat
        yesterday

























      answered Nov 14 at 12:02









      Phill W.

      7,7083727






















      up vote
      2
      down vote













      It may be legitimate in certain pre-calculated fields.



      If some of your queries are expensive and you decide to go with pre-calculated fields updated automatically using database triggers, then it may be legitimate to keep the lists inside a column.



      For example, suppose the UI shows this list in a grid view, where double-clicking a row opens the full details (with complete lists):



      REGISTERED USER LIST
      +------------------+----------------------------------------------------+
      |Name              |Top 3 most visited tags                             |
      +==================+====================================================+
      |Peter             |Design, Fitness, Gifts                              |
      +------------------+----------------------------------------------------+
      |Lucy              |Fashion, Gifts, Lifestyle                           |
      +------------------+----------------------------------------------------+


      You keep the second column updated by a trigger that fires when the client visits a new article, or by a scheduled task.



      You can make such a field available even for searching (as normal text).



      For such cases, keeping lists is legitimate. You just need to consider the possibility of exceeding the maximum field length.
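As a rough illustration of the trigger approach, here is one way it might look in SQLite. The tables app_user and visit, the column top_tags and the trigger name are all assumptions made for this sketch. The normalized visit rows stay the source of truth, and the list column is only a cached display value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table app_user (
        user_id integer primary key,
        name text not null,
        top_tags text not null default ''   -- pre-calculated display value
    );
    create table visit (
        user_id integer not null references app_user (user_id),
        tag text not null
    );
    -- Recompute the cached list whenever a visit is recorded.
    create trigger visit_refresh_top_tags after insert on visit
    begin
        update app_user
        set top_tags = (
            select coalesce(group_concat(tag, ', '), '')
            from (
                select tag from visit
                where visit.user_id = new.user_id
                group by tag
                order by count(*) desc, tag
                limit 3
            )
        )
        where user_id = new.user_id;
    end;
    insert into app_user (user_id, name) values (1, 'Peter');
""")

visited = ["Gifts", "Design", "Fitness", "Travel", "Gifts", "Design", "Fitness"]
for tag in visited:
    conn.execute("insert into visit values (1, ?)", (tag,))

(top_tags,) = conn.execute(
    "select top_tags from app_user where user_id = 1").fetchone()
```

After these inserts, top_tags holds the three most visited tags for Peter. SQLite does not promise the order of items inside a group_concat result, so anything that depends on the order should sort the normalized rows instead. Recomputing on every insert also makes writes more expensive, which is where the scheduled-task variant can be the better fit.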





      Also, if you are using Microsoft Access, its multivalued fields are another special use case. They handle your lists in a field automatically.



      But you can always fall back to the standard normalized form shown in other answers.





      Summary: The normal forms are a theoretical model needed for understanding important aspects of data modeling. But normalization does not take into account performance or other costs of retrieving the data; that is out of scope for the theoretical model. Storing lists or other pre-calculated (and controlled) duplicates is often required by a practical implementation.



      In light of the above, in a practical implementation, would we prefer a query that relies on perfect normal form and runs for 20 seconds, or an equivalent query that relies on pre-calculated values and takes 0.08 s? No one likes their software product to be accused of slowness.

























      • 1




        It can be legitimate even without precalculated stuff. I've done it a couple of times where the data is stored properly but for performance reasons it's useful to stuff a few cached results in the main records.
        – Loren Pechtel
        Nov 15 at 4:10










      • @LorenPechtel – Yes, thanks, in my use of the term pre-calculated I also include cached values stored where needed. In systems with complex dependencies, they are the way to keep performance normal. And if programmed with adequate know-how, these values are reliable and always in sync. I just did not want to add the case of caching to the answer, to keep it simple and on the safe side. It got downvoted anyway. :)
        – miroxlav
        2 days ago












      • @LorenPechtel Actually, that would still be a bad reason... cache data should be kept in an intermediate cache store, and while the cache is still valid, that query should never hit the main DB.
        – Tezra
        yesterday










      • @Tezra No, I'm saying that sometimes a piece of data from a secondary table is needed often enough to make it make sense to put a copy in the main record. (Example that I have done--the employee table includes the last time in and the last time out. They are used only for display purposes, any actual calculation comes from the table with the clock-in/clock-out records.)
        – Loren Pechtel
        yesterday























      edited 2 days ago

























      answered Nov 15 at 0:21









      miroxlav

      444211












      • 1




        It can be legitimate even without precalculated stuff. I've done it a couple of times where the data is stored properly but for performance reasons it's useful to stuff a few cached results in the main records.
        – Loren Pechtel
        Nov 15 at 4:10










      • @LorenPechtel – Yes, thanks, in my use of the term "pre-calculated" I also include cached values stored where needed. In systems with complex dependencies, they are the way to keep performance acceptable. And if programmed with adequate know-how, these values are reliable and always in sync. I just did not want to add the caching case to the answer, to keep it simple and on the safe side. It got downvoted anyway. :)
        – miroxlav
        2 days ago












      • @LorenPechtel Actually, that would still be a bad reason... cache data should be kept in an intermediate cache store, and while the cache is still valid, that query should never hit the main DB.
        – Tezra
        yesterday










      • @Tezra No, I'm saying that sometimes a piece of data from a secondary table is needed often enough to make it make sense to put a copy in the main record. (Example that I have done--the employee table includes the last time in and the last time out. They are used only for display purposes, any actual calculation comes from the table with the clock-in/clock-out records.)
        – Loren Pechtel
        yesterday
























      up vote
      0
      down vote













      Given two tables, which we'll call Person and Task, each with its own ID (PersonID, TaskID), the basic idea is to create a third table to bind them together. We'll call this table PersonToTask. At a minimum it should have its own ID, as well as the two others.
      So when it comes to assigning someone to a task, you no longer need to UPDATE the Person table; you just INSERT a new row into the PersonToTask table.
      Maintenance also becomes easier: deleting a task is just a DELETE based on TaskID, with no more updating of the Person table and its associated parsing.



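      -- Junction table: one row per (person, task) assignment.
      CREATE TABLE dbo.PersonToTask (
      pttID INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
      PersonID INT NOT NULL,
      TaskID INT NOT NULL
      );
      GO -- batch separator (SSMS/sqlcmd); CREATE PROCEDURE must start its own batch

      -- Assign a person to a task by inserting one row.
      CREATE PROCEDURE dbo.Task_Assigned (@PersonID INT, @TaskID INT)
      AS
      BEGIN
      INSERT INTO dbo.PersonToTask (PersonID, TaskID)
      VALUES (@PersonID, @TaskID);
      END
      GO

      -- Delete a task: remove its assignments first, then the task itself.
      CREATE PROCEDURE dbo.Task_Deleted (@TaskID INT)
      AS
      BEGIN
      DELETE FROM dbo.PersonToTask WHERE TaskID = @TaskID;
      DELETE FROM dbo.Task WHERE TaskID = @TaskID;
      END
      GO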


      How about a simple report of who is assigned to a task?



      CREATE PROCEDURE dbo.Task_CurrentAssigned (@TaskID INT)
      AS
      BEGIN
      SELECT PersonName
      FROM dbo.Person
      WHERE PersonID IN (SELECT PersonID FROM dbo.PersonToTask WHERE TaskID = @TaskID)
      END


      You could of course do a lot more; for example, a TimeReport becomes possible if you add DateTime fields for TaskAssigned and TaskCompleted. It's all up to you.
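      For anyone who wants to try the pattern without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. It is a variation under my own assumptions, not the code above: it uses a composite primary key instead of a separate ID so that the same person cannot be assigned to the same task twice, it adds foreign keys with ON DELETE CASCADE, and the helper names assign and assigned_to are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE Person (PersonID INTEGER PRIMARY KEY, PersonName TEXT NOT NULL);
    CREATE TABLE Task   (TaskID   INTEGER PRIMARY KEY, TaskName   TEXT NOT NULL);
    CREATE TABLE PersonToTask (
        PersonID INTEGER NOT NULL REFERENCES Person(PersonID) ON DELETE CASCADE,
        TaskID   INTEGER NOT NULL REFERENCES Task(TaskID)     ON DELETE CASCADE,
        PRIMARY KEY (PersonID, TaskID)    -- each pair can exist only once
    );
    INSERT INTO Person VALUES (1, 'Jack'), (2, 'Jill');
    INSERT INTO Task   VALUES (1, 'Fetch water'), (2, 'Climb hill');
""")

def assign(person_id, task_id):
    """Assigning is a single INSERT into the junction table."""
    conn.execute(
        "INSERT INTO PersonToTask (PersonID, TaskID) VALUES (?, ?)",
        (person_id, task_id),
    )

def assigned_to(task_id):
    """Names of everyone assigned to a task, via a JOIN on the junction table."""
    rows = conn.execute(
        """SELECT p.PersonName
             FROM Person AS p
             JOIN PersonToTask AS pt ON pt.PersonID = p.PersonID
            WHERE pt.TaskID = ?
            ORDER BY p.PersonName""",
        (task_id,),
    )
    return [name for (name,) in rows]

assign(1, 1)
assign(2, 1)
assign(1, 2)
print(assigned_to(1))
```

      With the cascade in place, deleting a row from Task also removes its assignments, so the two-step delete is handled by the database rather than by a procedure.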






          answered Nov 14 at 19:26









          Mad Myche























              up vote
              0
              down vote













              It may work if, say, you have human-readable primary keys and want a list of task numbers without having to deal with the vertical nature of a table structure; i.e., the first table below is much easier to read than the second.



              ------------------------  
              Employee Name | Task
              Jack | 1,2,5
              Jill | 4,6,7
              ------------------------

              ------------------------
              Employee Name | Task
              Jack | 1
              Jack | 2
              Jack | 5
              Jill | 4
              Jill | 6
              Jill | 7
              ------------------------


              The question would then be: should the task list be stored or generated on demand? That largely depends on requirements such as how often the list is needed, how accurate it must be, how many data rows exist, how the data will be used, etc. After that, the trade-offs between user experience and meeting the requirements should be analyzed.



              For example, compare the time it would take to recall the 2 stored rows with the time it takes to run a query that generates the 2 rows. If generating takes long and the user does not need the most up-to-date list (say, expecting less than 1 change per day), then it could be stored.



              Or, if the user needs a historical record of the tasks assigned to them, it would also make sense to store the list. So it really depends on what you are doing; never say never.
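              To make the "generated on demand" option concrete, here is a minimal sketch in Python with the built-in sqlite3 module. It stores the data in the second (one row per task) layout and builds the first (comma-separated) layout only when asked. The table name assignment and the function task_lists are hypothetical names chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE assignment (
        employee TEXT    NOT NULL,
        task     INTEGER NOT NULL,
        PRIMARY KEY (employee, task)
    );
    INSERT INTO assignment VALUES
        ('Jack', 1), ('Jack', 2), ('Jack', 5),
        ('Jill', 4), ('Jill', 6), ('Jill', 7);
""")

def task_lists():
    """Return {employee: 'n,n,n'} built on demand from the one-row-per-task table."""
    rows = conn.execute(
        "SELECT employee, task FROM assignment ORDER BY employee, task"
    )
    lists = {}
    for employee, task in rows:
        lists.setdefault(employee, []).append(str(task))
    return {employee: ",".join(tasks) for employee, tasks in lists.items()}

print(task_lists())  # {'Jack': '1,2,5', 'Jill': '4,6,7'}
```

              Whether to cache the result of something like task_lists() is then exactly the stored-versus-generated trade-off described above.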






              answered Nov 14 at 19:46









              Double E CPU













              • As you say, it all depends on how the data is to be retrieved. If you /only/ ever query this table by User Name, then the "list" field is perfectly adequate. However, how can you query such a table to find out who is working on Task #1234567 and still keep it performant? Just about every kind of "find-X-anywhere-in-the-field" String function will cause such a query to /Table Scan/, slowing things to a crawl. With properly normalised, properly indexed data, that just doesn't happen.
                – Phill W.
                2 days ago
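              The table-scan point can be checked with a small experiment. This is only a sketch using SQLite through Python's sqlite3 module, with made-up table names (person_list, assignment) and a made-up helper plan; other databases report their plans differently, and the exact wording of SQLite's plan text varies between versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- "List in a field" layout
    CREATE TABLE person_list (name TEXT PRIMARY KEY, tasks TEXT NOT NULL);
    -- Normalized layout with an index on the task column
    CREATE TABLE assignment (
        name TEXT    NOT NULL,
        task INTEGER NOT NULL,
        PRIMARY KEY (name, task)
    );
    CREATE INDEX assignment_by_task ON assignment (task);
""")

def plan(sql, params=()):
    """Return SQLite's query plan for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column is the description

# Who works on task 1234567? The list layout needs a substring match...
list_plan = plan(
    "SELECT name FROM person_list WHERE ',' || tasks || ',' LIKE ?",
    ("%,1234567,%",),
)
# ...while the normalized layout can use the index on task.
index_plan = plan("SELECT name FROM assignment WHERE task = ?", (1234567,))

print(list_plan)   # expected to report a full SCAN of person_list
print(index_plan)  # expected to report a SEARCH using assignment_by_task
```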




























              up vote
              0
              down vote













              You're taking what should be another table, turning it through 90 degrees and shoehorning it into another table.



              It's like having an order table where you have itemProdcode1, itemQuantity1, itemPrice1 ... itemProdcode37, itemQuantity37, itemPrice37. Apart from being awkward to handle programmatically, you can guarantee that tomorrow someone will want to order 38 things.



              I'd only do it your way if the 'list' isn't really a list, i.e. where it stands as a whole and each individual line item doesn't refer to some clear and independent entity. In that case just stuff it all in some data type that's big enough.



              So an order is a list, a Bill Of Materials is a list (or a list of lists, which would be even more of a nightmare to implement "sideways"). But a note/comment and a poem aren't.
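              For comparison, here is a minimal sketch of the order example with one row per line item, using Python's built-in sqlite3 module. The names order_line, add_line and order_total are hypothetical, and prices are kept as integer cents purely to keep the arithmetic exact.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_line (
        order_id    INTEGER NOT NULL,
        line_no     INTEGER NOT NULL,
        prod_code   TEXT    NOT NULL,
        quantity    INTEGER NOT NULL,
        price_cents INTEGER NOT NULL,
        PRIMARY KEY (order_id, line_no)
    );
""")

def add_line(order_id, prod_code, quantity, price_cents):
    """Append a line item; the next line_no is max + 1 for that order."""
    (next_no,) = conn.execute(
        "SELECT COALESCE(MAX(line_no), 0) + 1 FROM order_line WHERE order_id = ?",
        (order_id,),
    ).fetchone()
    conn.execute(
        "INSERT INTO order_line VALUES (?, ?, ?, ?, ?)",
        (order_id, next_no, prod_code, quantity, price_cents),
    )
    return next_no

def order_total(order_id):
    """Total in cents, computed across however many lines the order has."""
    (total,) = conn.execute(
        "SELECT COALESCE(SUM(quantity * price_cents), 0) "
        "FROM order_line WHERE order_id = ?",
        (order_id,),
    ).fetchone()
    return total

# The 38th item needs no schema change, just another row.
for n in range(1, 39):
    add_line(1, f"P{n:03d}", 1, 250)

print(order_total(1))  # 9500
```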






                  answered Nov 14 at 23:06









                  Bloke Down The Pub


















                      protected by gnat Nov 15 at 5:28


