Frustrating Times: When datetime Crashes the Party in a Fabric Notebook (ENG)

Published on December 23, 2025 at 4:43 PM

Last week, I found myself wrestling with an issue for my client. The mission? Rebuild a Spark SQL query that creates a finance table for reporting on their organizational formation. Sounds straightforward, right?

Spoiler alert: it wasn’t.

 

To present the data, we report everything against a specific organizational structure called an organigram. Nothing fancy, just your typical hierarchy chart. But then the trouble began. The goal was simple: find the correct organigram for the active period of each formation value.
In other words:

  • Which organigram belongs to code 20000 during the time period 2023-01-01 to 2099-12-31?

The SQL for that lookup looks as follows:

 

left join lh_enriched.vw_dim_organigram org
  on beheer.code_oe = org.OrganisatorischeEenheid
  AND dd_eind_formatieplaats between BeginOrganigram and EindOrganigram

 

However, things didn’t go as planned. Instead of the expected results, I got… NULLs. Everywhere. My join had apparently decided to ghost me. Strange, because I knew the data was there. I double-checked that every date field was a timestamp in Spark SQL, and, above all, I could clearly see values for code 20000 in the right time window. Yet the result? A big fat pile of NULLs.
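For anyone retracing these steps: the double-check itself is quick. Here is a minimal sketch; the dimension view name comes from the join above, but the exact column list and whether code 20000 is stored as a string are assumptions on my part.

-- Inspect the declared column types of the dimension view
DESCRIBE lh_enriched.vw_dim_organigram;

-- Eyeball the rows for the code in question (20000 may be a string or a number in your model)
SELECT OrganisatorischeEenheid, BeginOrganigram, EindOrganigram
FROM lh_enriched.vw_dim_organigram
WHERE OrganisatorischeEenheid = '20000';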

At this point, I felt like a detective at a crime scene where all the suspects are timestamps. So I tried to force the join to behave by explicitly converting the values to timestamps, like this:

 

to_timestamp(coalesce(beheer.dd_eind_formatieplaats, '2099-12-31'))
  between to_timestamp(org.BeginOrganigram) and to_timestamp(org.EindOrganigram)

 

Alas, this still gave me a big fat pile of NULLs!

Here’s where it got really weird: I ran the same query in two separate Fabric notebook sessions… and got two different results. One session happily returned data. The other? Still drowning in NULLs. At this point, I was questioning everything: my notebook settings, my Spark settings. Could some obscure configuration be sabotaging me?
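If you end up in the same spot, it helps to compare the session settings that influence date and timestamp parsing. This is only a diagnostic sketch of the Spark configs I would check first; I never pinned the difference between my two sessions down to a single one of them.

-- Show settings that commonly affect date/timestamp parsing in Spark 3
SET spark.sql.legacy.timeParserPolicy;   -- LEGACY vs CORRECTED/EXCEPTION parsing behaviour
SET spark.sql.ansi.enabled;              -- with ANSI on, invalid casts raise errors instead of returning NULL
SET spark.sql.session.timeZone;          -- time zone used when converting timestamps to dates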

After some deep diving (and a few cups of coffee), I discovered the likely culprit:

 

  • ISO 8601 strings with a T separator and a Z suffix were not being parsed by a plain CAST(... AS DATE) in my session.

 

Yep, my innocent-looking CAST('2019-01-01T00:00:00Z' AS DATE) was silently returning NULL. Sneaky.

I had been parsing values with to_timestamp(...) or casting to DATE, but Spark wanted something more specific: an explicit format.

As an example:

 

CAST('2019-01-01T00:00:00Z' AS DATE)

needs to be:

to_date(col, "yyyy-MM-dd'T'HH:mm:ssX")
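
To see the difference in isolation, here is a tiny probe that only uses literals, so it is safe to run in any notebook cell (what the first column returns depends on your session, which is exactly the point):

SELECT
  CAST('2019-01-01T00:00:00Z' AS DATE)                            AS plain_cast,     -- NULL in my session
  to_date('2019-01-01T00:00:00Z', "yyyy-MM-dd'T'HH:mm:ssX")       AS with_format,    -- 2019-01-01
  to_timestamp('2019-01-01T00:00:00Z', "yyyy-MM-dd'T'HH:mm:ssX")  AS full_timestamp  -- shown in the session time zone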

 

So eventually, I ended up with the following Spark SQL join condition:

 

   on coalesce(to_date(eind_formatieplaats, "yyyy-MM-dd'T'HH:mm:ssX"), date'2099-12-31')
      BETWEEN to_date(BeginOrganigram, "yyyy-MM-dd'T'HH:mm:ssX")
      AND coalesce(to_date(EindOrganigram, "yyyy-MM-dd'T'HH:mm:ssX"), date'9999-12-31')

 

Once I ran the adjusted query, I finally got the expected results. Me happy, client happy! But let’s be honest, a few frustrating hours had passed.
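If you want a quick regression check for this kind of fix, a standalone query against the dimension view does the trick. A minimal sketch, reusing the names from the join above (whether code 20000 is stored as a string is, again, an assumption on my part):

-- Does code 20000 resolve to an organigram for a date inside the reporting window?
SELECT org.OrganisatorischeEenheid, org.BeginOrganigram, org.EindOrganigram
FROM lh_enriched.vw_dim_organigram org
WHERE org.OrganisatorischeEenheid = '20000'
  AND date'2023-01-01' BETWEEN to_date(org.BeginOrganigram, "yyyy-MM-dd'T'HH:mm:ssX")
                           AND coalesce(to_date(org.EindOrganigram, "yyyy-MM-dd'T'HH:mm:ssX"), date'9999-12-31');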

Now the report delivers the correct data to power HR insights. Mission accomplished! It just took a little longer than I had planned. And that’s the thing: no matter how long you’ve worked with data platforms, or with Fabric specifically, there will always be challenges that look simple on the surface but end up eating more time than you’d like.

In the end, the client is happy, and I learned something new this week.

Let’s see what next week’s challenge will be. Until then… happy coding!