New site development update - Jan 2010

Announcements about the database, website, and plugins.
Havrest
Former Moderator
Posts: 79
Joined: Sun Nov 08, 2009 2:12 pm
Location: France

Re: New site development update - Jan 2010

Postby Havrest » Mon Mar 01, 2010 7:25 pm

Hi,

I'm working on a project in which I'd like to do some data versioning just like it will be on the new website (if I read correctly). I'm not yet started with the development (PHP/mysql) and I like to share/confront my ideas with some existing solution/concept when it's possible.
So I was wondering if you mind sharing with me the logic behind the new TvDb versioning system ?

Just so you know what I'm talking about, I would store versions of subtitles lines. I was thinking about a really simple way to do that : store every lines (old or current) in the same table with additional fields to the old ones to link them with the current one. But I was wondering if there were a smarter way to do that.
I read stuff about using triggers and a revision table, it looks fine too (better I mean :p ).

I'll really appreciate if you take time to answer to me but if you don't I'll totally understand.

Ps: You can answer by PM if you want to but it might be interesting for others.

Thank you.

szsori
Site Admin
Posts: 1911
Joined: Fri Nov 03, 2006 5:23 pm

Re: New site development update - Jan 2010

Postby szsori » Mon Mar 01, 2010 11:20 pm

Sure, I can explain. If you're only looking for versioning on a single table then what you've described is best. It also works best when you're working with a fresh schema and wouldn't have to rewrite a ton of code if you have duplicate records in a single table. The main issue with using a single table is that your primary key can't be used for anything other than adding/removing revisions. If that's not a concern, just add an automatic timestamp to the table and use the most recent one.

I don't have versioning pinned down completely in the new schema yet, but I believe it will be far more complex for me. We're versioning at least 15 tables and some of them are immense. Since there are few instances we actually need to access the versioning information, I'm considering making a duplicate of our database that removes the primary key restraints. Then, when a record is updated I'll either use a trigger or stored proc to copy the previous record to the extra database before writing the new data. Rolling back a record would consist of doing that again, but overwriting the current record with the old information. It's a bit complex but will work for all but one or two of our tables. The place it gets tricky is when we're doing a crossjoin table like the one I'm using for our flexible seasons. I may have to make a duplicate of ALL of those records and set the date on all of them to match. Otherwise I'd have to convert those to a simple CSV format and store in a table on the extra database, but that's somewhat poor database design although it does clean things up considerably. In other words, I'm not sure on that part yet. ;)

Also, are these subtitles for tv shows? Any interest in building them directly into our site? ;)

Havrest
Former Moderator
Posts: 79
Joined: Sun Nov 08, 2009 2:12 pm
Location: France

Re: New site development update - Jan 2010

Postby Havrest » Tue Mar 02, 2010 4:11 am

Ok. Thank you for your answer. You're right, that's a bit more complex for you.
For now I just thought for versioning on only one table so I think I'll go with the single table as my primary key won't represent anything for the entries which only exist as a part of a subtitle.

Yes, these subtitles are for tv shows. I'm thinking of creating a website on which teams who are creating subtitles for tv shows will find tools to make their job easier. And another thing I was considering was "social subtitles". Meaning collaborative writing of subtitles. Even if this part is not yet completely clear in my head now. Users could edit/correct existing subtitles (and the result automatically saved to the subtitle or stored as a suggestion if the subtitle creator doesn't allow it), create a new subtitle and allow a team/some users/everyone to participate in the subtitle creation. Etc. I just have to add some rules and constraints so we don't see only Fastsub on the website. (like: normal users couldn't create subtitles that legit teams are already doing, etc.) I was thinking of a point system or something like that.
The project is not really started. I'm writing down my ideas and hope to start soon but as I'm the only one working on that (and I want to keep it that way for now) and only on my spare time. I can't even think of a "release date" (if I ever finish it ^^).
Of course I woud be interested in making the subtitles available on thetvdb (it would be fair as I would use the API to retrieve tv shows informations) but as I said I can't say when it will be done. So I think the best is to talk about it again when the development is almost over.
(One thing I didn't think about is that I first thought about the website in an "french only" website so if I open it to every language I'll have to rethink some of my database schemes. Better now than when really started the code anyway)
So, I'll think about it but I don't make promises ;)

Any thoughts on the global idea ?

szsori
Site Admin
Posts: 1911
Joined: Fri Nov 03, 2006 5:23 pm

Re: New site development update - Jan 2010

Postby szsori » Tue Mar 02, 2010 9:18 am

I'd strongly recommend that you go the multilingual route right away. Even if you don't support it initially, make the code and schema support it. Adding it on later is one of the reasons we're rewriting this site... it just makes a big mess. You'd be surprised how appreciative people are when you make something they can use for their language, even if the user support isn't there yet. Our foreign users add a TON of custom content here.

Also, what we could always do later on is link the subs in from your site similar to how we link in IMDB, TV.com, etc. If you're planning on relating your subtitles to our series and episode id's, then make sure you store those in the database and we could link between the sites really easily. I don't necessarily think every related site should be assimilated into this one, but this is definitely a case where making it easier to automatically relate the records will make things easier on end users. :)

If you have any more questions, just let me know. Also, I'd recommend looking at CakePHP. It doesn't really work for our site, but for one like yours it could really speed development time.

theFingerofGod
Posts: 48
Joined: Tue Jun 02, 2009 3:27 am

Re: New site development update - Jan 2010

Postby theFingerofGod » Thu Mar 18, 2010 6:48 am

I just wanted to say that I really love reading this stuff. Do I understand half of what I read? NO! But when I do manage to parse some of what I read it makes me understand how complex this undertaking is. This in turn reminds me of how thankful I should be that anyone is undertaking something so immense!

I can't WAIT to see the new site!

Good Luck!

Best,

A theTVDb Fan!

Alanoll
Posts: 21
Joined: Tue Oct 14, 2008 11:45 am

Re: New site development update - Jan 2010

Postby Alanoll » Tue Mar 23, 2010 2:28 pm

In regards to versioning, have you considered using a more hierarchical approach?
Meaning...you have a Series, with a Series ID, and perhaps a timestamp of creation. You then want to version the series for any edits such as genre, timeslots, etc, the series specific information that a user can change.

You could then have two tables: a header table and then the actual tables containing all possible versions.

Code: Select all

SELECT
  *
FROM   Series    as S
INNER JOIN vSeries as vS
             ON vS.SeriesID = S.SeriesID
           AND vS.SeriesHeaderID = (SELECT TOP 1 vSH.SeriesHeaderID
                                                  FROM   vSeries as vS2
                                                  INNER JOIN vSeriesHdr AS vSH
                                                              ON vSH.SeriesHeaderID = vS2.SeriesHeaderID
                                                  WHERE vS2.SeriesID = vS.SeriesID
                                                  ORDER BY vSH.SeriesHeaderID DESC)
WHERE S.SeriesID = 9999999


Following that logic, would allow you to then version the episodes contained within the series/seasons and so forth. With the correct index, the performance should be rather good.

szsori
Site Admin
Posts: 1911
Joined: Fri Nov 03, 2006 5:23 pm

Re: New site development update - Jan 2010

Postby szsori » Tue Mar 23, 2010 4:40 pm

That's a really interesting idea. I'm going to have to think that through a bit since there are some instances where the data might not allow that method, in which case we'd need to use a different method. Our tables would need reworking as well, which I'm not sure I'd want to do at this point.

Alanoll
Posts: 21
Joined: Tue Oct 14, 2008 11:45 am

Re: New site development update - Jan 2010

Postby Alanoll » Tue Mar 23, 2010 8:28 pm

yeah, depending on how far you've gotten, it may be more trouble then it's worth.

I've got no idea of the current schema, or even the new one for that matter, but I find that style of versioning allows for more flexibility in terms of points in time then others I've seen. As in, if you wanted to see how the series looked as of version 4, you could add that as a limiting factor within the subquery.

I'm a developer on a product that uses versioning on a daily basis, so I'm more than happy to assist in any way.