.. currentmodule:: numpy

.. _arrays.datetime:

************************
Datetimes and Timedeltas
************************

.. versionadded:: 1.7.0

Starting in NumPy 1.7, there are core array data types which natively
support datetime functionality. The data type is called "datetime64",
so named because "datetime" is already taken by the datetime library
included in Python.

.. note:: The datetime API is *experimental* in 1.7.0, and may undergo changes
   in future versions of NumPy.

Basic Datetimes
===============

The most basic way to create datetimes is from strings in
ISO 8601 date or datetime format. The unit for internal storage
is automatically selected from the form of the string, and can
be either a :ref:`date unit <arrays.dtypes.dateunits>` or a
:ref:`time unit <arrays.dtypes.timeunits>`. The date units are years ('Y'),
months ('M'), weeks ('W'), and days ('D'), while the time units are
hours ('h'), minutes ('m'), seconds ('s'), milliseconds ('ms'), and
some additional SI-prefix seconds-based units.

.. admonition:: Example

    A simple ISO date:

    >>> np.datetime64('2005-02-25')
    numpy.datetime64('2005-02-25')

    Using months for the unit:

    >>> np.datetime64('2005-02')
    numpy.datetime64('2005-02')

    Specifying just the month, but forcing a 'days' unit:

    >>> np.datetime64('2005-02', 'D')
    numpy.datetime64('2005-02-01')

    Using UTC "Zulu" time:

    >>> np.datetime64('2005-02-25T03:30Z')
    numpy.datetime64('2005-02-24T21:30-0600')

    ISO 8601 specifies that the local time zone is used
    when none is given explicitly:

    >>> np.datetime64('2005-02-25T03:30')
    numpy.datetime64('2005-02-25T03:30-0600')

When creating an array of datetimes from a string, it is still possible
to automatically select the unit from the inputs, by using the
datetime type with generic units.

.. admonition:: Example

    >>> np.array(['2007-07-13', '2006-01-13', '2010-08-13'], dtype='datetime64')
    array(['2007-07-13', '2006-01-13', '2010-08-13'], dtype='datetime64[D]')

    >>> np.array(['2001-01-01T12:00', '2002-02-03T13:56:03.172'], dtype='datetime64')
    array(['2001-01-01T12:00:00.000-0600', '2002-02-03T13:56:03.172-0600'], dtype='datetime64[ms]')


The datetime type works with many common NumPy functions; for
example, :func:`arange` can be used to generate ranges of dates.

.. admonition:: Example

    All the dates for one month:

    >>> np.arange('2005-02', '2005-03', dtype='datetime64[D]')
    array(['2005-02-01', '2005-02-02', '2005-02-03', '2005-02-04',
           '2005-02-05', '2005-02-06', '2005-02-07', '2005-02-08',
           '2005-02-09', '2005-02-10', '2005-02-11', '2005-02-12',
           '2005-02-13', '2005-02-14', '2005-02-15', '2005-02-16',
           '2005-02-17', '2005-02-18', '2005-02-19', '2005-02-20',
           '2005-02-21', '2005-02-22', '2005-02-23', '2005-02-24',
           '2005-02-25', '2005-02-26', '2005-02-27', '2005-02-28'],
           dtype='datetime64[D]')
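
A step can also be given to :func:`arange`. The following is a minimal sketch,
assuming a recent NumPy; the variable names are illustrative only. It builds
a weekly range by passing a ``timedelta64`` step.

```python
import numpy as np

# Start and stop are day-resolution datetimes; the stop is exclusive.
start = np.datetime64('2005-02-01')
stop = np.datetime64('2005-03-01')

# A timedelta64 step of 7 days yields one date per week.
weekly = np.arange(start, stop, np.timedelta64(7, 'D'))
print(weekly)
```

The result keeps the ``datetime64[D]`` dtype of the endpoints.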

The datetime object represents a single moment in time. Two
datetimes with different units may still represent the same
moment, and converting from a bigger unit like months to a
smaller unit like days is considered a 'safe' cast because
the moment in time is still represented exactly.

.. admonition:: Example

    >>> np.datetime64('2005') == np.datetime64('2005-01-01')
    True

    >>> np.datetime64('2010-03-14T15Z') == np.datetime64('2010-03-14T15:00:00.00Z')
    True
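
The notion of a 'safe' cast can be checked directly. This is a small sketch
using :func:`can_cast`, assuming a recent NumPy; the variable names are
illustrative only.

```python
import numpy as np

months = np.dtype('datetime64[M]')
days = np.dtype('datetime64[D]')

# Months -> days keeps the moment exactly, so it counts as a safe cast.
widening_is_safe = np.can_cast(months, days, casting='safe')

# Days -> months would drop the day of the month, so it is not safe.
narrowing_is_safe = np.can_cast(days, months, casting='safe')

# An explicit astype still performs the lossy conversion by truncating.
truncated = np.datetime64('2005-02-25').astype('datetime64[M]')

print(widening_is_safe, narrowing_is_safe, truncated)
```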

An important exception to this rule is conversion between datetimes with
:ref:`date units <arrays.dtypes.dateunits>` and datetimes with
:ref:`time units <arrays.dtypes.timeunits>`, because this kind
of conversion generally requires choosing a time zone and a
particular time of day on the given date.

.. admonition:: Example

    >>> np.datetime64('2003-12-25', 's')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: Cannot parse "2003-12-25" as unit 's' using casting rule 'same_kind'

    >>> np.datetime64('2003-12-25') == np.datetime64('2003-12-25T00Z')
    False


Datetime and Timedelta Arithmetic
=================================

NumPy allows the subtraction of two Datetime values, an operation which
produces a number with a time unit. Because NumPy doesn't have a physical
quantities system in its core, the timedelta64 data type was created
to complement datetime64.

Together, datetimes and timedeltas support simple datetime calculations
such as differences, offsets, and ratios of durations.

.. admonition:: Example

    >>> np.datetime64('2009-01-01') - np.datetime64('2008-01-01')
    numpy.timedelta64(366,'D')

    >>> np.datetime64('2009') + np.timedelta64(20, 'D')
    numpy.datetime64('2009-01-21')

    >>> np.datetime64('2011-06-15T00:00') + np.timedelta64(12, 'h')
    numpy.datetime64('2011-06-15T12:00-0500')

    >>> np.timedelta64(1,'W') / np.timedelta64(1,'D')
    7.0
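
The same arithmetic applies element-wise to arrays. The sketch below, with
made-up dates and illustrative variable names, computes the gaps between
consecutive dates, converts them to plain floats, and shifts every date by
one week.

```python
import numpy as np

dates = np.array(['2011-06-15', '2011-06-18', '2011-07-01'],
                 dtype='datetime64[D]')

# Differences between consecutive datetimes are timedelta64 values.
gaps = np.diff(dates)

# Dividing by a one-day timedelta gives the gaps as floating-point days.
gap_days = gaps / np.timedelta64(1, 'D')

# Adding a timedelta shifts every element; weeks convert exactly to days.
shifted = dates + np.timedelta64(1, 'W')

print(gaps, gap_days, shifted)
```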

There are two Timedelta units ('Y', years and 'M', months) which are treated
specially, because how much time they represent changes depending
on when they are used. While a timedelta day unit is equivalent to
24 hours, there is no way to convert a month unit into days, because
different months have different numbers of days.

.. admonition:: Example

    >>> a = np.timedelta64(1, 'Y')

    >>> np.timedelta64(a, 'M')
    numpy.timedelta64(12,'M')

    >>> np.timedelta64(a, 'D')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: Cannot cast NumPy timedelta64 scalar from metadata [Y] to [D] according to the rule 'same_kind'
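
Although a month timedelta cannot be converted to days on its own, the length
of a *specific* month can be recovered by anchoring it to a datetime. The
following is one possible sketch; the variable names are illustrative only.

```python
import numpy as np

# A concrete month, stored with month resolution.
feb_2012 = np.datetime64('2012-02', 'M')

# Month arithmetic is well defined on month-resolution datetimes.
next_month = feb_2012 + np.timedelta64(1, 'M')

# Converting both endpoints to days (a safe cast) and subtracting
# gives the number of days in that particular month.
days_in_month = next_month.astype('datetime64[D]') - feb_2012.astype('datetime64[D]')

print(days_in_month)
```

Because 2012 is a leap year, the difference here is 29 days.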

Datetime Units
==============

The Datetime and Timedelta data types support a large number of time
units, as well as generic units which can be coerced into any of the
other units based on input data.

Datetimes are always stored based on POSIX time (though having a TAI
mode which allows for accounting of leap-seconds is proposed), with
an epoch of 1970-01-01T00:00Z. This means the supported dates are
always a symmetric interval around the epoch, called "time span" in the
table below.

The length of the span is the range of a 64-bit integer times the length
of the date or time unit.  For example, the time span for 'W' (week) is exactly
7 times longer than the time span for 'D' (day), and the time span for
'D' (day) is exactly 24 times longer than the time span for 'h' (hour).
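
The relative spans can be estimated with a few lines of arithmetic. This
sketch assumes a mean Gregorian year of 365.2425 days (a choice made here for
illustration only) and uses illustrative names throughout.

```python
import numpy as np

# Largest value a signed 64-bit integer can hold.
INT64_MAX = int(np.iinfo(np.int64).max)

# Assumption: mean Gregorian year length, used only for a rough estimate.
SECONDS_PER_YEAR = 365.2425 * 24 * 60 * 60

# Length of one unit in seconds for a few of the time units.
unit_seconds = {'s': 1.0, 'ms': 1e-3, 'us': 1e-6, 'ns': 1e-9}

# Approximate one-sided span in years for each unit.
span_years = {unit: INT64_MAX * length / SECONDS_PER_YEAR
              for unit, length in unit_seconds.items()}

for unit, years in span_years.items():
    print(unit, f"+/- {years:.3g} years")
```

For nanoseconds this gives a span of roughly 292 years on either side of the
epoch, and each step up to a unit 1000 times longer multiplies the span by 1000.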

Here are the date units:

.. _arrays.dtypes.dateunits:

======== ================ ======================= ==========================
  Code       Meaning       Time span (relative)    Time span (absolute)
======== ================ ======================= ==========================
   Y       year             +/- 9.2e18 years        [9.2e18 BC, 9.2e18 AD]
   M       month            +/- 7.6e17 years        [7.6e17 BC, 7.6e17 AD]
   W       week             +/- 1.7e17 years        [1.7e17 BC, 1.7e17 AD]
   D       day              +/- 2.5e16 years        [2.5e16 BC, 2.5e16 AD]
======== ================ ======================= ==========================

And here are the time units:

.. _arrays.dtypes.timeunits:

======== ================ ======================= ==========================
  Code       Meaning       Time span (relative)    Time span (absolute)
======== ================ ======================= ==========================
   h       hour             +/- 1.0e15 years        [1.0e15 BC, 1.0e15 AD]
   m       minute           +/- 1.7e13 years        [1.7e13 BC, 1.7e13 AD]
   s       second           +/- 2.9e11 years        [2.9e11 BC, 2.9e11 AD]
   ms      millisecond      +/- 2.9e8 years         [ 2.9e8 BC,  2.9e8 AD]
   us      microsecond      +/- 2.9e5 years         [290301 BC, 294241 AD]
   ns      nanosecond       +/- 292 years           [  1678 AD,   2262 AD]
   ps      picosecond       +/- 106 days            [  1969 AD,   1970 AD]
   fs      femtosecond      +/- 2.6 hours           [  1969 AD,   1970 AD]
   as      attosecond       +/- 9.2 seconds         [  1969 AD,   1970 AD]
======== ================ ======================= ==========================

Business Day Functionality
==========================

To allow the datetime to be used in contexts where only certain days of
the week are valid, NumPy includes a set of "busday" (business day)
functions.

The default for busday functions is that the only valid days are Monday
through Friday (the usual business days).  The implementation is based on
a "weekmask" containing 7 Boolean flags to indicate valid days; custom
weekmasks are possible that specify other sets of valid days.

The "busday" functions can additionally check a list of "holiday" dates,
specific dates that are not valid days.

The function :func:`busday_offset` allows you to apply offsets
specified in business days to datetimes with a unit of 'D' (day).

.. admonition:: Example

    >>> np.busday_offset('2011-06-23', 1)
    numpy.datetime64('2011-06-24')

    >>> np.busday_offset('2011-06-23', 2)
    numpy.datetime64('2011-06-27')
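
Holidays are skipped in the same way as weekend days. The sketch below passes
a one-element ``holidays`` list to :func:`busday_offset`; the choice of
2011-07-04 as a holiday is only an example.

```python
import numpy as np

# 2011-07-01 is a Friday, so the next weekday is Monday 2011-07-04.
without_holidays = np.busday_offset('2011-07-01', 1)

# Marking that Monday as a holiday pushes the result to Tuesday.
with_holidays = np.busday_offset('2011-07-01', 1, holidays=['2011-07-04'])

print(without_holidays, with_holidays)
```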

When an input date falls on the weekend or a holiday,
:func:`busday_offset` first applies a rule to roll the
date to a valid business day, then applies the offset. The
default rule is 'raise', which simply raises an exception.
The rules most typically used are 'forward' and 'backward'.

.. admonition:: Example

    >>> np.busday_offset('2011-06-25', 2)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: Non-business day date in busday_offset

    >>> np.busday_offset('2011-06-25', 0, roll='forward')
    numpy.datetime64('2011-06-27')

    >>> np.busday_offset('2011-06-25', 2, roll='forward')
    numpy.datetime64('2011-06-29')

    >>> np.busday_offset('2011-06-25', 0, roll='backward')
    numpy.datetime64('2011-06-24')

    >>> np.busday_offset('2011-06-25', 2, roll='backward')
    numpy.datetime64('2011-06-28')

In some cases, an appropriate use of the roll and the offset
is necessary to get a desired answer.

.. admonition:: Example

    The first business day on or after a date:

    >>> np.busday_offset('2011-03-20', 0, roll='forward')
    numpy.datetime64('2011-03-21','D')
    >>> np.busday_offset('2011-03-22', 0, roll='forward')
    numpy.datetime64('2011-03-22','D')

    The first business day strictly after a date:

    >>> np.busday_offset('2011-03-20', 1, roll='backward')
    numpy.datetime64('2011-03-21','D')
    >>> np.busday_offset('2011-03-22', 1, roll='backward')
    numpy.datetime64('2011-03-23','D')

The function is also useful for computing dates that are defined by
a weekday rule, such as some holidays. In Canada and the U.S., Mother's Day
is on the second Sunday in May, which can be computed with a custom
weekmask.

.. admonition:: Example

    >>> np.busday_offset('2012-05', 1, roll='forward', weekmask='Sun')
    numpy.datetime64('2012-05-13','D')
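
The same pattern generalizes to any "n-th weekday of a month" rule. The helper
``nth_weekday`` below is hypothetical, written only for this illustration, and
is not part of NumPy.

```python
import numpy as np

def nth_weekday(month, n, weekday):
    """Return the n-th given weekday of a month (hypothetical helper).

    month   -- a month string such as '2012-11'
    n       -- 1 for the first occurrence, 2 for the second, and so on
    weekday -- a three-letter abbreviation such as 'Thu'
    """
    # Rolling forward finds the first matching weekday of the month;
    # an offset of n - 1 then steps to the n-th occurrence.
    return np.busday_offset(month, n - 1, roll='forward', weekmask=weekday)

# U.S. Thanksgiving falls on the fourth Thursday in November.
thanksgiving_2012 = nth_weekday('2012-11', 4, 'Thu')

# Mother's Day, the second Sunday in May.
mothers_day_2012 = nth_weekday('2012-05', 2, 'Sun')

print(thanksgiving_2012, mothers_day_2012)
```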

When performance is important for manipulating many business dates
with one particular choice of weekmask and holidays, there is
an object :class:`busdaycalendar` which stores the data necessary
in an optimized form.
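
A brief sketch of how such a calendar might be used follows. The holiday date
and variable names are examples only; the ``busdaycal`` parameter is accepted
by :func:`is_busday`, :func:`busday_offset`, and :func:`busday_count`.

```python
import numpy as np

# Build the calendar once: Monday to Friday, with one example holiday.
cal = np.busdaycalendar(weekmask='Mon Tue Wed Thu Fri',
                        holidays=['2011-07-04'])

# One week of dates starting on Friday 2011-07-01.
dates = np.arange('2011-07-01', '2011-07-08', dtype='datetime64[D]')

# Reuse the same calendar across the busday functions.
valid = np.is_busday(dates, busdaycal=cal)
next_valid = np.busday_offset('2011-07-01', 1, busdaycal=cal)
n_valid = np.busday_count('2011-07-01', '2011-07-08', busdaycal=cal)

print(valid, next_valid, n_valid)
```

When ``busdaycal`` is supplied, the ``weekmask`` and ``holidays`` arguments
should not be passed as well, since the calendar already carries both.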

np.is_busday():
```````````````
To test a datetime64 value to see if it is a valid day, use :func:`is_busday`.

.. admonition:: Example

    >>> np.is_busday(np.datetime64('2011-07-15'))  # a Friday
    True
    >>> np.is_busday(np.datetime64('2011-07-16')) # a Saturday
    False
    >>> np.is_busday(np.datetime64('2011-07-16'), weekmask="Sat Sun")
    True
    >>> a = np.arange(np.datetime64('2011-07-11'), np.datetime64('2011-07-18'))
    >>> np.is_busday(a)
    array([ True,  True,  True,  True,  True, False, False], dtype='bool')

np.busday_count():
``````````````````
To find how many valid days there are in a specified range of datetime64
dates, use :func:`busday_count`:

.. admonition:: Example

    >>> np.busday_count(np.datetime64('2011-07-11'), np.datetime64('2011-07-18'))
    5
    >>> np.busday_count(np.datetime64('2011-07-18'), np.datetime64('2011-07-11'))
    -5
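
The count respects a custom weekmask as well. As a small sketch with
illustrative variable names, the following counts the weekdays and the
weekend days in July 2011.

```python
import numpy as np

# The end date is exclusive, so this range covers all of July 2011.
weekdays_in_july = np.busday_count('2011-07', '2011-08')

# A weekmask of Saturday and Sunday counts weekend days instead.
weekend_days_in_july = np.busday_count('2011-07', '2011-08',
                                       weekmask='Sat Sun')

print(weekdays_in_july, weekend_days_in_july)
```

The two counts add up to the 31 days of the month.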

If you have an array of datetime64 day values, and you want a count of
how many of them are valid dates, you can do this:

.. admonition:: Example

    >>> a = np.arange(np.datetime64('2011-07-11'), np.datetime64('2011-07-18'))
    >>> np.count_nonzero(np.is_busday(a))
    5



Custom Weekmasks
----------------

Here are several examples of custom weekmask values.  Each of them
specifies the "busday" default of Monday through Friday being valid days::

    # Positional sequences; positions are Monday through Sunday.
    # Length of the sequence must be exactly 7.
    weekmask = [1, 1, 1, 1, 1, 0, 0]
    # list or other sequence; 0 == invalid day, 1 == valid day
    weekmask = "1111100"
    # string '0' == invalid day, '1' == valid day

    # string abbreviations from this list: Mon Tue Wed Thu Fri Sat Sun
    weekmask = "Mon Tue Wed Thu Fri"
    # any amount of whitespace is allowed; abbreviations are case-sensitive.
    weekmask = "MonTue Wed  Thu\tFri"
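
One way to confirm that different spellings describe the same mask is to
normalize them through :class:`busdaycalendar` and compare the resulting
Boolean arrays. This is a sketch with illustrative variable names.

```python
import numpy as np

# Three spellings of the default Monday-to-Friday weekmask.
forms = [
    [1, 1, 1, 1, 1, 0, 0],
    "1111100",
    "Mon Tue Wed Thu Fri",
]

# busdaycalendar normalizes each form to a 7-element boolean array.
masks = [np.busdaycalendar(weekmask=form).weekmask for form in forms]

# Every normalized mask should match the first one.
all_equal = all(np.array_equal(masks[0], mask) for mask in masks[1:])

print(masks[0], all_equal)
```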

Differences Between 1.6 and 1.7 Datetimes
=========================================

The NumPy 1.6 release includes a more primitive datetime data type
than 1.7. This section documents many of the changes that have taken
place.

String Parsing
``````````````

The datetime string parser in NumPy 1.6 is very liberal in what it accepts,
and silently allows invalid input without raising errors. The parser in
NumPy 1.7 is quite strict about only accepting ISO 8601 dates, with a few
convenience extensions. 1.6 always creates microsecond (us) units by
default, whereas 1.7 detects a unit based on the format of the string.
Here is a comparison::

    # NumPy 1.6.1
    >>> np.datetime64('1979-03-22')
    1979-03-22 00:00:00
    # NumPy 1.7.0
    >>> np.datetime64('1979-03-22')
    numpy.datetime64('1979-03-22')

    # NumPy 1.6.1, unit default microseconds
    >>> np.datetime64('1979-03-22').dtype
    dtype('datetime64[us]')
    # NumPy 1.7.0, unit of days detected from string
    >>> np.datetime64('1979-03-22').dtype
    dtype('<M8[D]')

    # NumPy 1.6.1, ignores invalid part of string
    >>> np.datetime64('1979-03-2corruptedstring')
    1979-03-02 00:00:00
    # NumPy 1.7.0, raises error for invalid input
    >>> np.datetime64('1979-03-2corruptedstring')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: Error parsing datetime string "1979-03-2corruptedstring" at position 8

    # NumPy 1.6.1, 'nat' produces today's date
    >>> np.datetime64('nat')
    2012-04-30 00:00:00
    # NumPy 1.7.0, 'nat' produces not-a-time
    >>> np.datetime64('nat')
    numpy.datetime64('NaT')

    # NumPy 1.6.1, 'garbage' produces today's date
    >>> np.datetime64('garbage')
    2012-04-30 00:00:00
    # NumPy 1.7.0, 'garbage' raises an exception
    >>> np.datetime64('garbage')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: Error parsing datetime string "garbage" at position 0

    # NumPy 1.6.1, can't specify unit in scalar constructor
    >>> np.datetime64('1979-03-22T19:00', 'h')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: function takes at most 1 argument (2 given)
    # NumPy 1.7.0, unit in scalar constructor
    >>> np.datetime64('1979-03-22T19:00', 'h')
    numpy.datetime64('1979-03-22T19:00-0500','h')

    # NumPy 1.6.1, reads ISO 8601 strings w/o TZ as UTC
    >>> np.array(['1979-03-22T19:00'], dtype='M8[h]')
    array([1979-03-22 19:00:00], dtype=datetime64[h])
    # NumPy 1.7.0, reads ISO 8601 strings w/o TZ as local (ISO specifies this)
    >>> np.array(['1979-03-22T19:00'], dtype='M8[h]')
    array(['1979-03-22T19-0500'], dtype='datetime64[h]')

    # NumPy 1.6.1, doesn't parse all ISO 8601 strings correctly
    >>> np.array(['1979-03-22T12'], dtype='M8[h]')
    array([1979-03-22 00:00:00], dtype=datetime64[h])
    >>> np.array(['1979-03-22T12:00'], dtype='M8[h]')
    array([1979-03-22 12:00:00], dtype=datetime64[h])
    # NumPy 1.7.0, handles this case correctly
    >>> np.array(['1979-03-22T12'], dtype='M8[h]')
    array(['1979-03-22T12-0500'], dtype='datetime64[h]')
    >>> np.array(['1979-03-22T12:00'], dtype='M8[h]')
    array(['1979-03-22T12-0500'], dtype='datetime64[h]')

Unit Conversion
```````````````

The 1.6 implementation of datetime does not convert between units correctly::

    # NumPy 1.6.1, the representation value is untouched
    >>> np.array(['1979-03-22'], dtype='M8[D]')
    array([1979-03-22 00:00:00], dtype=datetime64[D])
    >>> np.array(['1979-03-22'], dtype='M8[D]').astype('M8[M]')
    array([2250-08-01 00:00:00], dtype=datetime64[M])
    # NumPy 1.7.0, the representation is scaled accordingly
    >>> np.array(['1979-03-22'], dtype='M8[D]')
    array(['1979-03-22'], dtype='datetime64[D]')
    >>> np.array(['1979-03-22'], dtype='M8[D]').astype('M8[M]')
    array(['1979-03'], dtype='datetime64[M]')

Datetime Arithmetic
```````````````````

The 1.6 implementation of datetime only works correctly for a small subset of
arithmetic operations. Here we show some simple cases::

    # NumPy 1.6.1, produces invalid results if units are incompatible
    >>> a = np.array(['1979-03-22T12'], dtype='M8[h]')
    >>> b = np.array([3*60], dtype='m8[m]')
    >>> a + b
    array([1970-01-01 00:00:00.080988], dtype=datetime64[us])
    # NumPy 1.7.0, promotes to higher-resolution unit
    >>> a = np.array(['1979-03-22T12'], dtype='M8[h]')
    >>> b = np.array([3*60], dtype='m8[m]')
    >>> a + b
    array(['1979-03-22T15:00-0500'], dtype='datetime64[m]')

    # NumPy 1.6.1, arithmetic works if everything is microseconds
    >>> a = np.array(['1979-03-22T12:00'], dtype='M8[us]')
    >>> b = np.array([3*60*60*1000000], dtype='m8[us]')
    >>> a + b
    array([1979-03-22 15:00:00], dtype=datetime64[us])
    # NumPy 1.7.0
    >>> a = np.array(['1979-03-22T12:00'], dtype='M8[us]')
    >>> b = np.array([3*60*60*1000000], dtype='m8[us]')
    >>> a + b
    array(['1979-03-22T15:00:00.000000-0500'], dtype='datetime64[us]')