Finding the Day Difference Between Two Dates in C

Necessity is the mother of invention, and nowhere is that more true than in the realm of programming. Many of my best ideas for programming projects come not from casually looking for project ideas on the Internet (although that does help) but rather from having an emerging need for a bit of functionality that I currently lack, and realizing that I have the ability to implement that functionality with my current skill set.

I’m currently writing a system of custom software that I will be able to use to analyze stats on my WordPress pages (I know I’m getting into meta territory here). I’m trying to figure out what kinds of posts people like the most so I can focus on those. It’s kind of a super-autistic way of making my blog grow (I am a diagnosed autistic BTW, so don’t anyone get offended). One of the tasks I’m attempting to solve involves analyzing a flat-file database in C. I would use awk to do this, but unfortunately this task involves getting the date, and there doesn’t appear to be any way to do that in awk. At first I was considering writing my own version of awk that adds the features I need, then I decided to just get the source code for awk and modify it to give it the needed features. The problem is that looking over a program that big and figuring out which pieces do what and how they all fit together could take days, even weeks, and I want a solution now. So for the time being I have to write the entire thing in C.

That brings me to the titular problem. I’m writing a program that reads a flat-file database one record at a time, extracts the information from that record, and performs a calculation with it. Specifically, it calculates a formula that I came up with to rank my own blog posts by popularity: (Pageviews + Likes + Comments from other people)/(Age in days + 5). The 5 is added to account for statistical anomalies with pages that are less than a couple of days old, where the low denominator would otherwise rank them much higher than older pages even when they shouldn’t be.
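To make the formula concrete, here is a minimal sketch of it as a standalone function. The name article_rating and its parameter order are my own choices for illustration; they do not appear in the program below.

```c
#include <assert.h>

/* Hypothetical helper: the popularity score described above.
   The +5 in the denominator damps the score of very young posts. */
double article_rating( int views, int likes, int comments, int age_days ){
        assert( age_days >= 0 ); /* assumption: no post is dated in the future */
        return (double)(views + likes + comments) / (double)(age_days + 5);
}
```

A post with 6 views, 3 likes, and no comments that is 4 days old scores 9/9 = 1.0. Without the +5, a one-day-old post with 2 views would score 2.0 and outrank it.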

I think I like the format I used in this article, where I printed my entire code and then went through it line-by-line and explained what each part does, so I will repeat that format here.


 1 /*************************************
 2  * Artirate - Article rating program *
 3  * Version: Alpha                    *
 4  * Author: Michael Warren            *
 5  * Date: Feb 26 2019                 *
 6  * Instructions: Pipe into sort -n   *
 7  *************************************/
 8 
 9 #include <stdio.h>
10 #include <stdlib.h>
11 #include <string.h>
12 #include <time.h>
13 
14 /* Derives a day-of-year from day-of-month and month */
15 int get_yday( int day, int month ){
16         month--;
17         int ndays[] = { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 };
18         for( int i = 0; i < month; i++ ){
19                 day += ndays[i];
20         }
21         return day;
22 }
23 
24 int main( int argc, char **argv ){
25         FILE *fp = fopen( argv[1], "r" );
26         int c;
27         char buf[80];
28         fgets( buf, 80, fp ); // Go past the heading
29         while( (c = fgetc( fp )) != EOF ){
30                 ungetc( c, fp );
31 
32                 /* Get information */
33                 fgets( buf, 80, fp );
34                 if( buf[0] == '\n' ){
35                         fclose( fp );
36                         return 0;
37                 }
38                 char *title = strtok( buf, "\t\n" );
39                 int views = atoi( strtok( NULL, "\t\n" ) );
40                 int comments = atoi( strtok( NULL, "\t\n" ) );
41                 int likes = atoi( strtok( NULL, "\t\n" ) );
42                 char *date = strtok( NULL, "\t\n" );
43                 int year = atoi( strtok( date, "-" ) );
44                 int month = atoi( strtok( NULL, "-" ) );
45                 int day = atoi( strtok( NULL, "-" ) );
46 
47                 /* Calculate how long ago */
48                 int ago;
49                 time_t ts = time( NULL );
50                 struct tm *curdate = localtime( &ts );
51                 curdate->tm_year += 1900;
52                 curdate->tm_mon++;
53                 curdate->tm_yday++;
54                 if( curdate->tm_year == year ){
55                         if( curdate->tm_mon == month ){
56                         // Same year, same month
57                                 ago = curdate->tm_mday - day;
58                         }
59                         else{
60                         // Same year, different month
61                                 ago = curdate->tm_yday - get_yday( day, month );
62                         }
63                 }
64                 else{ // Different year
65                         ago = (curdate->tm_year - year) * 365;
66                         /* yday_ago is the difference between year days */
67                         /* It will typically be a negative value.       */
68                         int yday_ago = curdate->tm_yday - get_yday( day, month );
69                         ago += yday_ago; // yday_ago is signed, so adding it covers both cases
70                 }
71 
72                 /* Calculate final rating and print */
73                 float rating = ((float ) (views + comments + likes))/((float) ago + 5);
74                 printf( "%2.3f\t%s\n", rating, title );
75         }
76         fclose( fp );
77         return 0;
78 }

I will focus my attention on lines 15-22 and lines 48-70, the parts of the program that calculate the day difference between dates. These are the only parts that are particularly interesting. I’d never actually attempted to solve this particular problem before, and I don’t mean to toot my own horn, but some parts of my algorithm are pretty clever.

The goal here is to find the difference between the current date and the date given in the record. The recorded date is split into year, month, and day components – all integers. The current date is derived by first getting the timestamp (line 49) and then converting it into the more convenient struct tm format (line 50). The fields of the struct are adjusted so they match the format of the date given in the file (lines 51-53).
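The field adjustments can also be done on a copy instead of on the struct that localtime() returns, which points at storage the library may reuse. The following is only a sketch of that idea; struct simple_date and to_simple_date are hypothetical names that are not part of the program above.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical container holding a date in the same units as the file */
struct simple_date {
        int year;  /* full year, e.g. 2019 */
        int month; /* 1-12 */
        int day;   /* 1-31 */
        int yday;  /* 1-366 */
};

/* Converts a timestamp to local calendar fields, 1-based like the file's dates */
struct simple_date to_simple_date( time_t ts ){
        struct tm *lt = localtime( &ts );
        assert( lt != NULL );
        struct simple_date d;
        d.year  = lt->tm_year + 1900; /* tm_year counts years since 1900 */
        d.month = lt->tm_mon + 1;     /* tm_mon runs 0-11 */
        d.day   = lt->tm_mday;        /* tm_mday is already 1-31 */
        d.yday  = lt->tm_yday + 1;    /* tm_yday runs 0-365 */
        return d;
}
```

Calling to_simple_date( time( NULL ) ) gives today's date in a form that can be compared directly with the year, month, and day read from a record.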

The next several lines of code are fairly self-explanatory. I merely created three different schemes based on whether the two dates are in the same month of the same year, different months of the same year, or different years. The last of these (lines 65-69) starts with the year difference times 365 and then corrects it by the difference between the two day-of-year values. Notice I haven’t done any adjustments for leap years. Since precision isn’t important here I didn’t see much need for such adjustments, though I might add them later just for the lulz.
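The three cases can also be folded into a single expression, because the day-of-year difference already equals the day-of-month difference when the months match. This is a sketch under the same no-leap-year assumption, not the code from the listing; days_ago and yday_of are hypothetical names.

```c
#include <assert.h>

/* Day-of-year from a 1-based month and day, ignoring leap years */
int yday_of( int day, int month ){
        static const int ndays[] = { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 };
        assert( month >= 1 && month <= 12 );
        for( int i = 0; i < month - 1; i++ ){
                day += ndays[i];
        }
        return day;
}

/* Approximate days from (year, month, day) to (cur_year, cur_month, cur_day).
   Every year is treated as 365 days long. */
int days_ago( int cur_year, int cur_month, int cur_day,
              int year, int month, int day ){
        /* Signed: negative when the older date falls later in its year */
        int yday_diff = yday_of( cur_day, cur_month ) - yday_of( day, month );
        return ( cur_year - year ) * 365 + yday_diff;
}
```

For example, from 2019-12-31 to 2020-01-10 the year term contributes 365 and the day-of-year term contributes 10 - 365 = -355, which gives 10 days. Across a February 29 the result comes out one day short, which is the leap-year imprecision mentioned above.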

Now we get to the part I’m most proud of: the part that determines the difference between two days if the months are different. I think my solution was somewhat clever. Some people may think it’s obvious, but I personally think it’s clever, especially considering that my head is all congested right now and I can’t think too clearly. Basically, I calculated the day-of-year of the recorded date from its day-of-month and month, and then subtracted that from the curdate->tm_yday variable. To calculate the day-of-year, I created an integer array containing the lengths of all the months, then added them in succession while counting up to the recorded month (see lines 15-22). For example, February 18 becomes 31 + 18 = 49 and February 26 becomes 31 + 26 = 57, so the two dates are 8 days apart.
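If leap years ever start to matter, the same month-length table can be kept and the result bumped by one for dates after February in a leap year. The sketch below assumes the Gregorian rule and uses the hypothetical names is_leap and get_yday_leap; it takes the year as an extra argument, which get_yday() in the listing does not.

```c
#include <assert.h>

/* Gregorian rule: every 4th year, except centuries not divisible by 400 */
int is_leap( int year ){
        return ( year % 4 == 0 && year % 100 != 0 ) || year % 400 == 0;
}

/* Day-of-year (1-366) from a 1-based month and day.
   Hypothetical variant of get_yday() that also takes the year. */
int get_yday_leap( int day, int month, int year ){
        static const int ndays[] = { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 };
        assert( month >= 1 && month <= 12 );
        /* Add the lengths of all months before the given one */
        for( int i = 0; i < month - 1; i++ ){
                day += ndays[i];
        }
        /* Past February in a leap year, everything shifts by one day */
        if( month > 2 && is_leap( year ) ){
                day++;
        }
        return day;
}
```

With this variant, March 1 is day 60 in 2019 and day 61 in 2020, while dates in January and February are unaffected.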

Anyway, here’s the file I used this program on just for reference:


Article:        Views:          Comments:       Likes:          Date:
Intro           0               0               0               2019-02-16
Apple to Lenovo 9               2               2               2019-02-18
Starting Arch   4               0               0               2019-02-18
VirtualBox      4               0               2               2019-02-19
DoD algorithm   4               1               0               2019-02-21
Idiosyncrasies  0               1               0               2019-02-22
Minimalism      6               0               3               2019-02-22
PHP tutorial 1  2               0               0               2019-02-24
PHP tutorial 2  2               0               0               2019-02-25
Dimensions      2               0               1               2019-02-25

The numbers are admittedly lower than I would like them to be, especially since I’m now displaying them for the entire world, but this is only my second week here, so it’s not that much of an embarrassment. 😛 My output looks something like this:


$ ./artirate post-stats.db | sort -nr
1.000   Minimalism
1.000   Apple to Lenovo
0.500   VirtualBox
0.500   DoD algorithm
0.500   Dimensions
0.333   PHP tutorial 2
0.308   Starting Arch
0.286   PHP tutorial 1
0.111   Idiosyncrasies
0.000   Intro
