Cleaning and Manipulating Stock Data

roboticks

2021/02/10

Note: This is an example of cleaning Kiwoom data. It uses a simple toy example that resembles the real data. The full code is available on GitHub.

Let's create similar data

##                     DT PRICE SIZE SYMBOL
## 1  2021-02-10 08:59:30  24.7  804      a
## 2  2021-02-10 08:59:30  25.1 1057      b
## 3  2021-02-10 08:59:30  24.6 1035      c
## 4  2021-02-10 08:59:30  25.8 1186      d
## 5  2021-02-10 08:59:30  25.2  500      e
## 6  2021-02-10 08:59:35  24.6  525      a
## 7  2021-02-10 08:59:35  25.2  302      b
## 8  2021-02-10 08:59:35  25.4  472      c
## 9  2021-02-10 08:59:35  25.3  248      d
## 10 2021-02-10 08:59:35  24.8 1259      e
## 11 2021-02-10 08:59:40  25.8  351      a
## 12 2021-02-10 08:59:40  25.2  863      b
## 13 2021-02-10 08:59:40  24.7  479      c
## 14 2021-02-10 08:59:40  23.9 1188      d
## 15 2021-02-10 08:59:40  25.6  269      e
## 16 2021-02-10 08:59:45  25.0   80      a
## 17 2021-02-10 08:59:45  25.0  123      b
## 18 2021-02-10 08:59:45  25.5   99      c
## 19 2021-02-10 08:59:45  25.4  835      d
## 20 2021-02-10 08:59:45  25.3 1858      e
## 21 2021-02-10 08:59:50  25.5  498      a
## 22 2021-02-10 08:59:50  25.4 1270      b
## 23 2021-02-10 08:59:50  25.0  937      c
## 24 2021-02-10 08:59:50  24.0  533      d
## 25 2021-02-10 08:59:50  25.3 1617      e
##  [ reached 'max' / getOption("max.print") -- omitted 9975 rows ]
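The printout above implies 10,000 rows: 2,000 five-second timestamps starting at 08:59:30, each repeated for five symbols. A minimal base R sketch that builds a table of the same shape is given below. It is not the author's code (that lives on GitHub); the seed and the price and size distributions are assumptions chosen only to give plausible values, so the numbers will differ from the printout.

```r
# Toy tick data: 2,000 five-second timestamps x 5 symbols = 10,000 rows.
set.seed(42)  # arbitrary seed (assumption)
symbols <- c("a", "b", "c", "d", "e")
times <- seq(as.POSIXct("2021-02-10 08:59:30", tz = "UTC"),
             by = 5, length.out = 2000)
n <- length(times) * length(symbols)

ticks <- data.frame(
  DT     = rep(times, each = length(symbols)),        # same timestamp for all symbols
  PRICE  = round(rnorm(n, mean = 25, sd = 0.5), 1),   # assumed price distribution
  SIZE   = round(rexp(n, rate = 1 / 600)) + 1,        # assumed trade-size distribution
  SYMBOL = rep(symbols, times = length(times)),
  stringsAsFactors = FALSE
)

head(ticks, 5)
```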

Extracting a specific time window

##                     PRICE SIZE
## 2021-02-10 09:00:00  25.7  178
## 2021-02-10 09:00:00  24.9  208
## 2021-02-10 09:00:00  25.2  803
## 2021-02-10 09:00:00  25.0 1595
## 2021-02-10 09:00:00  24.3 1600
## 2021-02-10 09:00:05  24.8  499
## 2021-02-10 09:00:05  24.8  368
## 2021-02-10 09:00:05  25.0  411
## 2021-02-10 09:00:05  25.6  927
## 2021-02-10 09:00:05  25.4  865
## 2021-02-10 09:00:10  24.9  722
## 2021-02-10 09:00:10  24.9  271
## 2021-02-10 09:00:10  25.3  314
## 2021-02-10 09:00:10  25.3  756
## 2021-02-10 09:00:10  24.7  189
## 2021-02-10 09:00:15  24.6  377
## 2021-02-10 09:00:15  25.2  408
## 2021-02-10 09:00:15  25.4  266
## 2021-02-10 09:00:15  24.9   86
## 2021-02-10 09:00:15  25.4  199
## 2021-02-10 09:00:20  25.2  561
## 2021-02-10 09:00:20  24.7  447
## 2021-02-10 09:00:20  25.2  268
## 2021-02-10 09:00:20  24.4  314
## 2021-02-10 09:00:20  25.7  611
## 2021-02-10 09:00:25  26.0  116
## 2021-02-10 09:00:25  24.8  804
## 2021-02-10 09:00:25  24.5  201
## 2021-02-10 09:00:25  25.3  339
## 2021-02-10 09:00:25  24.9  338
## 2021-02-10 09:00:30  26.2  709
## 2021-02-10 09:00:30  25.0  523
## 2021-02-10 09:00:30  25.3  354
## 2021-02-10 09:00:30  25.0  778
## 2021-02-10 09:00:30  24.6  797
## 2021-02-10 09:00:35  25.1  905
## 2021-02-10 09:00:35  24.1  961
## 2021-02-10 09:00:35  25.7  257
## 2021-02-10 09:00:35  25.1  411
## 2021-02-10 09:00:35  26.1  395
## 2021-02-10 09:00:40  25.2 1082
## 2021-02-10 09:00:40  24.6   52
## 2021-02-10 09:00:40  25.3 1255
## 2021-02-10 09:00:40  24.5   22
## 2021-02-10 09:00:40  24.4 1112
## 2021-02-10 09:00:45  25.1  358
## 2021-02-10 09:00:45  24.8 1933
## 2021-02-10 09:00:45  25.0  113
## 2021-02-10 09:00:45  25.0  832
## 2021-02-10 09:00:45  24.7  982
##  [ reached getOption("max.print") -- omitted 15 rows ]
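The output above has 65 rows (50 printed, 15 omitted), which is consistent with keeping every tick from 09:00:00 to 09:01:00 inclusive: 13 timestamps times 5 symbols. The base R sketch below filters on that window. The window endpoints are inferred from the printout, and the toy data is simulated under assumed distributions, so this is an illustration rather than the original code.

```r
# Small simulated tick table: 60 five-second timestamps x 5 symbols.
set.seed(1)  # arbitrary seed (assumption)
symbols <- c("a", "b", "c", "d", "e")
times <- seq(as.POSIXct("2021-02-10 08:59:30", tz = "UTC"),
             by = 5, length.out = 60)
n <- length(times) * length(symbols)
ticks <- data.frame(
  DT     = rep(times, each = length(symbols)),
  PRICE  = round(rnorm(n, mean = 25, sd = 0.5), 1),
  SIZE   = round(rexp(n, rate = 1 / 600)) + 1,
  SYMBOL = rep(symbols, times = length(times)),
  stringsAsFactors = FALSE
)

# Keep ticks whose timestamp falls inside the closed window [from, to].
from <- as.POSIXct("2021-02-10 09:00:00", tz = "UTC")
to   <- as.POSIXct("2021-02-10 09:01:00", tz = "UTC")
in_window  <- ticks$DT >= from & ticks$DT <= to
window_ticks <- ticks[in_window, c("DT", "PRICE", "SIZE")]

nrow(window_ticks)
```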

Keeping only the required tickers (a, b, c, d)

##                     DT PRICE SIZE SYMBOL
## 1  2021-02-10 08:59:30  24.7  804      a
## 2  2021-02-10 08:59:30  25.1 1057      b
## 3  2021-02-10 08:59:30  24.6 1035      c
## 4  2021-02-10 08:59:30  25.8 1186      d
## 5  2021-02-10 08:59:35  24.6  525      a
## 6  2021-02-10 08:59:35  25.2  302      b
## 7  2021-02-10 08:59:35  25.4  472      c
## 8  2021-02-10 08:59:35  25.3  248      d
## 9  2021-02-10 08:59:40  25.8  351      a
## 10 2021-02-10 08:59:40  25.2  863      b
## 11 2021-02-10 08:59:40  24.7  479      c
## 12 2021-02-10 08:59:40  23.9 1188      d
## 13 2021-02-10 08:59:45  25.0   80      a
## 14 2021-02-10 08:59:45  25.0  123      b
## 15 2021-02-10 08:59:45  25.5   99      c
## 16 2021-02-10 08:59:45  25.4  835      d
## 17 2021-02-10 08:59:50  25.5  498      a
## 18 2021-02-10 08:59:50  25.4 1270      b
## 19 2021-02-10 08:59:50  25.0  937      c
## 20 2021-02-10 08:59:50  24.0  533      d
## 21 2021-02-10 08:59:55  25.0  317      a
## 22 2021-02-10 08:59:55  24.9  420      b
## 23 2021-02-10 08:59:55  24.3  181      c
## 24 2021-02-10 08:59:55  24.8  802      d
## 25 2021-02-10 09:00:00  25.7  178      a
##  [ reached 'max' / getOption("max.print") -- omitted 7975 rows ]
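The filtered table above keeps tickers a through d and drops e, and its row numbers restart at 1. A minimal base R sketch of that step follows, using the first ten rows of the original printout as input. The variable names `keep` and `subset_ticks` are illustrative.

```r
# First two timestamps of the toy data, copied from the printout above.
ticks <- data.frame(
  DT     = rep(as.POSIXct(c("2021-02-10 08:59:30", "2021-02-10 08:59:35"),
                          tz = "UTC"), each = 5),
  PRICE  = c(24.7, 25.1, 24.6, 25.8, 25.2, 24.6, 25.2, 25.4, 25.3, 24.8),
  SIZE   = c(804, 1057, 1035, 1186, 500, 525, 302, 472, 248, 1259),
  SYMBOL = rep(c("a", "b", "c", "d", "e"), times = 2),
  stringsAsFactors = FALSE
)

# Keep only the tickers of interest and renumber the rows from 1.
keep <- c("a", "b", "c", "d")
subset_ticks <- ticks[ticks$SYMBOL %in% keep, ]
rownames(subset_ticks) <- NULL

subset_ticks
```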

Splitting by ticker into a list

## $a
##                     DT PRICE SIZE SYMBOL
## 1  2021-02-10 08:59:30  24.7  804      a
## 5  2021-02-10 08:59:35  24.6  525      a
## 9  2021-02-10 08:59:40  25.8  351      a
## 13 2021-02-10 08:59:45  25.0   80      a
## 17 2021-02-10 08:59:50  25.5  498      a
## 21 2021-02-10 08:59:55  25.0  317      a
## 25 2021-02-10 09:00:00  25.7  178      a
## 29 2021-02-10 09:00:05  24.8  499      a
## 33 2021-02-10 09:00:10  24.9  722      a
## 37 2021-02-10 09:00:15  24.6  377      a
## 41 2021-02-10 09:00:20  25.2  561      a
## 45 2021-02-10 09:00:25  26.0  116      a
## 49 2021-02-10 09:00:30  26.2  709      a
## 53 2021-02-10 09:00:35  25.1  905      a
## 57 2021-02-10 09:00:40  25.2 1082      a
## 61 2021-02-10 09:00:45  25.1  358      a
## 65 2021-02-10 09:00:50  24.7 1041      a
## 69 2021-02-10 09:00:55  25.2 1044      a
## 73 2021-02-10 09:01:00  24.7 1036      a
## 77 2021-02-10 09:01:05  25.3  209      a
## 81 2021-02-10 09:01:10  24.7  505      a
## 85 2021-02-10 09:01:15  25.9  570      a
## 89 2021-02-10 09:01:20  24.7 1431      a
## 93 2021-02-10 09:01:25  24.8  739      a
## 97 2021-02-10 09:01:30  24.7  305      a
##  [ reached 'max' / getOption("max.print") -- omitted 1975 rows ]
## 
## $b
##                     DT PRICE SIZE SYMBOL
## 2  2021-02-10 08:59:30  25.1 1057      b
## 6  2021-02-10 08:59:35  25.2  302      b
## 10 2021-02-10 08:59:40  25.2  863      b
## 14 2021-02-10 08:59:45  25.0  123      b
## 18 2021-02-10 08:59:50  25.4 1270      b
## 22 2021-02-10 08:59:55  24.9  420      b
## 26 2021-02-10 09:00:00  24.9  208      b
## 30 2021-02-10 09:00:05  24.8  368      b
## 34 2021-02-10 09:00:10  24.9  271      b
## 38 2021-02-10 09:00:15  25.2  408      b
## 42 2021-02-10 09:00:20  24.7  447      b
## 46 2021-02-10 09:00:25  24.8  804      b
## 50 2021-02-10 09:00:30  25.0  523      b
## 54 2021-02-10 09:00:35  24.1  961      b
## 58 2021-02-10 09:00:40  24.6   52      b
## 62 2021-02-10 09:00:45  24.8 1933      b
## 66 2021-02-10 09:00:50  24.9   56      b
## 70 2021-02-10 09:00:55  25.5 1267      b
## 74 2021-02-10 09:01:00  25.6  187      b
## 78 2021-02-10 09:01:05  24.4  311      b
## 82 2021-02-10 09:01:10  25.0 1295      b
## 86 2021-02-10 09:01:15  25.4  489      b
## 90 2021-02-10 09:01:20  24.8  396      b
## 94 2021-02-10 09:01:25  24.8 2079      b
## 98 2021-02-10 09:01:30  25.7  645      b
##  [ reached 'max' / getOption("max.print") -- omitted 1975 rows ]
## 
## $c
##                     DT PRICE SIZE SYMBOL
## 3  2021-02-10 08:59:30  24.6 1035      c
## 7  2021-02-10 08:59:35  25.4  472      c
## 11 2021-02-10 08:59:40  24.7  479      c
## 15 2021-02-10 08:59:45  25.5   99      c
## 19 2021-02-10 08:59:50  25.0  937      c
## 23 2021-02-10 08:59:55  24.3  181      c
## 27 2021-02-10 09:00:00  25.2  803      c
## 31 2021-02-10 09:00:05  25.0  411      c
## 35 2021-02-10 09:00:10  25.3  314      c
## 39 2021-02-10 09:00:15  25.4  266      c
## 43 2021-02-10 09:00:20  25.2  268      c
## 47 2021-02-10 09:00:25  24.5  201      c
## 51 2021-02-10 09:00:30  25.3  354      c
## 55 2021-02-10 09:00:35  25.7  257      c
## 59 2021-02-10 09:00:40  25.3 1255      c
## 63 2021-02-10 09:00:45  25.0  113      c
## 67 2021-02-10 09:00:50  25.6  176      c
## 71 2021-02-10 09:00:55  24.8  746      c
## 75 2021-02-10 09:01:00  25.6   16      c
## 79 2021-02-10 09:01:05  24.7  105      c
## 83 2021-02-10 09:01:10  24.5  153      c
## 87 2021-02-10 09:01:15  25.5  251      c
## 91 2021-02-10 09:01:20  25.7  475      c
## 95 2021-02-10 09:01:25  24.9  377      c
## 99 2021-02-10 09:01:30  24.9 1164      c
##  [ reached 'max' / getOption("max.print") -- omitted 1975 rows ]
## 
## $d
##                      DT PRICE SIZE SYMBOL
## 4   2021-02-10 08:59:30  25.8 1186      d
## 8   2021-02-10 08:59:35  25.3  248      d
## 12  2021-02-10 08:59:40  23.9 1188      d
## 16  2021-02-10 08:59:45  25.4  835      d
## 20  2021-02-10 08:59:50  24.0  533      d
## 24  2021-02-10 08:59:55  24.8  802      d
## 28  2021-02-10 09:00:00  25.0 1595      d
## 32  2021-02-10 09:00:05  25.6  927      d
## 36  2021-02-10 09:00:10  25.3  756      d
## 40  2021-02-10 09:00:15  24.9   86      d
## 44  2021-02-10 09:00:20  24.4  314      d
## 48  2021-02-10 09:00:25  25.3  339      d
## 52  2021-02-10 09:00:30  25.0  778      d
## 56  2021-02-10 09:00:35  25.1  411      d
## 60  2021-02-10 09:00:40  24.5   22      d
## 64  2021-02-10 09:00:45  25.0  832      d
## 68  2021-02-10 09:00:50  24.2  911      d
## 72  2021-02-10 09:00:55  25.2  215      d
## 76  2021-02-10 09:01:00  25.4 1122      d
## 80  2021-02-10 09:01:05  24.4 1096      d
## 84  2021-02-10 09:01:10  25.1  312      d
## 88  2021-02-10 09:01:15  25.2  261      d
## 92  2021-02-10 09:01:20  24.7 1789      d
## 96  2021-02-10 09:01:25  25.2 2289      d
## 100 2021-02-10 09:01:30  24.9  935      d
##  [ reached 'max' / getOption("max.print") -- omitted 1975 rows ]
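The list above has one data frame per ticker, and each keeps the row labels of the filtered table (1, 5, 9, ... for a). Base R's `split()` behaves this way, so a sketch of the step could look like the following. The input reuses the first eight filtered rows from the printout.

```r
# Filtered ticks (tickers a-d), first two timestamps from the printout.
subset_ticks <- data.frame(
  DT     = rep(as.POSIXct(c("2021-02-10 08:59:30", "2021-02-10 08:59:35"),
                          tz = "UTC"), each = 4),
  PRICE  = c(24.7, 25.1, 24.6, 25.8, 24.6, 25.2, 25.4, 25.3),
  SIZE   = c(804, 1057, 1035, 1186, 525, 302, 472, 248),
  SYMBOL = rep(c("a", "b", "c", "d"), times = 2),
  stringsAsFactors = FALSE
)

# One data frame per ticker; original row labels are preserved.
by_symbol <- split(subset_ticks, subset_ticks$SYMBOL)

names(by_symbol)
by_symbol$a
```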

Creating additional variables for each ticker

## $a
##                     DT PRICE SIZE SYMBOL RETURN CUM_SIZE
## 1  2021-02-10 08:59:30  24.7  804      a     NA      804
## 2  2021-02-10 08:59:35  24.6  525      a   0.00     1329
## 3  2021-02-10 08:59:40  25.8  351      a   0.05     1680
## 4  2021-02-10 08:59:45  25.0   80      a  -0.03     1760
## 5  2021-02-10 08:59:50  25.5  498      a   0.02     2258
## 6  2021-02-10 08:59:55  25.0  317      a  -0.02     2575
## 7  2021-02-10 09:00:00  25.7  178      a   0.03     2753
## 8  2021-02-10 09:00:05  24.8  499      a  -0.04     3252
## 9  2021-02-10 09:00:10  24.9  722      a   0.00     3974
## 10 2021-02-10 09:00:15  24.6  377      a  -0.01     4351
## 11 2021-02-10 09:00:20  25.2  561      a   0.02     4912
## 12 2021-02-10 09:00:25  26.0  116      a   0.03     5028
## 13 2021-02-10 09:00:30  26.2  709      a   0.01     5737
## 14 2021-02-10 09:00:35  25.1  905      a  -0.04     6642
## 15 2021-02-10 09:00:40  25.2 1082      a   0.00     7724
## 16 2021-02-10 09:00:45  25.1  358      a   0.00     8082
##  [ reached 'max' / getOption("max.print") -- omitted 1984 rows ]
## 
## $b
##                     DT PRICE SIZE SYMBOL RETURN CUM_SIZE
## 1  2021-02-10 08:59:30  25.1 1057      b     NA     1057
## 2  2021-02-10 08:59:35  25.2  302      b   0.00     1359
## 3  2021-02-10 08:59:40  25.2  863      b   0.00     2222
## 4  2021-02-10 08:59:45  25.0  123      b  -0.01     2345
## 5  2021-02-10 08:59:50  25.4 1270      b   0.02     3615
## 6  2021-02-10 08:59:55  24.9  420      b  -0.02     4035
## 7  2021-02-10 09:00:00  24.9  208      b   0.00     4243
## 8  2021-02-10 09:00:05  24.8  368      b   0.00     4611
## 9  2021-02-10 09:00:10  24.9  271      b   0.00     4882
## 10 2021-02-10 09:00:15  25.2  408      b   0.01     5290
## 11 2021-02-10 09:00:20  24.7  447      b  -0.02     5737
## 12 2021-02-10 09:00:25  24.8  804      b   0.00     6541
## 13 2021-02-10 09:00:30  25.0  523      b   0.01     7064
## 14 2021-02-10 09:00:35  24.1  961      b  -0.04     8025
## 15 2021-02-10 09:00:40  24.6   52      b   0.02     8077
## 16 2021-02-10 09:00:45  24.8 1933      b   0.01    10010
##  [ reached 'max' / getOption("max.print") -- omitted 1984 rows ]
## 
## $c
##                     DT PRICE SIZE SYMBOL RETURN CUM_SIZE
## 1  2021-02-10 08:59:30  24.6 1035      c     NA     1035
## 2  2021-02-10 08:59:35  25.4  472      c   0.03     1507
## 3  2021-02-10 08:59:40  24.7  479      c  -0.03     1986
## 4  2021-02-10 08:59:45  25.5   99      c   0.03     2085
## 5  2021-02-10 08:59:50  25.0  937      c  -0.02     3022
## 6  2021-02-10 08:59:55  24.3  181      c  -0.03     3203
## 7  2021-02-10 09:00:00  25.2  803      c   0.04     4006
## 8  2021-02-10 09:00:05  25.0  411      c  -0.01     4417
## 9  2021-02-10 09:00:10  25.3  314      c   0.01     4731
## 10 2021-02-10 09:00:15  25.4  266      c   0.00     4997
## 11 2021-02-10 09:00:20  25.2  268      c  -0.01     5265
## 12 2021-02-10 09:00:25  24.5  201      c  -0.03     5466
## 13 2021-02-10 09:00:30  25.3  354      c   0.03     5820
## 14 2021-02-10 09:00:35  25.7  257      c   0.02     6077
## 15 2021-02-10 09:00:40  25.3 1255      c  -0.02     7332
## 16 2021-02-10 09:00:45  25.0  113      c  -0.01     7445
##  [ reached 'max' / getOption("max.print") -- omitted 1984 rows ]
## 
## $d
##                     DT PRICE SIZE SYMBOL RETURN CUM_SIZE
## 1  2021-02-10 08:59:30  25.8 1186      d     NA     1186
## 2  2021-02-10 08:59:35  25.3  248      d  -0.02     1434
## 3  2021-02-10 08:59:40  23.9 1188      d  -0.06     2622
## 4  2021-02-10 08:59:45  25.4  835      d   0.06     3457
## 5  2021-02-10 08:59:50  24.0  533      d  -0.06     3990
## 6  2021-02-10 08:59:55  24.8  802      d   0.03     4792
## 7  2021-02-10 09:00:00  25.0 1595      d   0.01     6387
## 8  2021-02-10 09:00:05  25.6  927      d   0.02     7314
## 9  2021-02-10 09:00:10  25.3  756      d  -0.01     8070
## 10 2021-02-10 09:00:15  24.9   86      d  -0.02     8156
## 11 2021-02-10 09:00:20  24.4  314      d  -0.02     8470
## 12 2021-02-10 09:00:25  25.3  339      d   0.04     8809
## 13 2021-02-10 09:00:30  25.0  778      d  -0.01     9587
## 14 2021-02-10 09:00:35  25.1  411      d   0.00     9998
## 15 2021-02-10 09:00:40  24.5   22      d  -0.02    10020
## 16 2021-02-10 09:00:45  25.0  832      d   0.02    10852
##  [ reached 'max' / getOption("max.print") -- omitted 1984 rows ]
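The two new columns above are a tick-to-tick return rounded to two decimals and a running total of SIZE. The sketch below applies one helper to every element of the list with `lapply()`. The helper name `add_vars` is hypothetical, and the simple-return formula is an assumption: at two decimals the printed values cannot distinguish simple returns from log returns. The input is the first five ticks of a and b from the printout.

```r
dt <- seq(as.POSIXct("2021-02-10 08:59:30", tz = "UTC"), by = 5, length.out = 5)
by_symbol <- list(
  a = data.frame(DT = dt, PRICE = c(24.7, 24.6, 25.8, 25.0, 25.5),
                 SIZE = c(804, 525, 351, 80, 498)),
  b = data.frame(DT = dt, PRICE = c(25.1, 25.2, 25.2, 25.0, 25.4),
                 SIZE = c(1057, 302, 863, 123, 1270))
)

# Hypothetical helper: add RETURN and CUM_SIZE to one ticker's data frame.
add_vars <- function(df) {
  df <- df[order(df$DT), ]                      # make sure ticks are in time order
  prev <- head(df$PRICE, -1)                    # price at the previous tick
  df$RETURN   <- round(c(NA, diff(df$PRICE) / prev), 2)  # simple return (assumption)
  df$CUM_SIZE <- cumsum(df$SIZE)                # cumulative traded size
  rownames(df) <- NULL
  df
}

with_vars <- lapply(by_symbol, add_vars)
with_vars$a
```

For ticker a this gives CUM_SIZE values 804, 1329, 1680, 1760, 2258, matching the first rows printed above.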

Building OHLC-V bars at a fixed interval (e.g. 60 seconds)

## $a
##                       DT SYMBOL OPEN HIGH  LOW CLOSE VOLUME
##   1: 2021-02-10 09:00:00      a 24.7 25.8 24.6  25.0   2575
##   2: 2021-02-10 09:01:00      a 25.7 26.2 24.6  25.2   7592
##   3: 2021-02-10 09:02:00      a 24.7 25.9 24.0  24.6   7035
##   4: 2021-02-10 09:03:00      a 25.2 26.2 24.4  26.2   9750
##   5: 2021-02-10 09:04:00      a 24.9 25.9 24.0  25.1   9643
##  ---                                                       
## 164: 2021-02-10 11:43:00      a 27.4 28.3 24.4  27.0   4823
## 165: 2021-02-10 11:44:00      a 26.5 27.2 25.2  27.1   4834
## 166: 2021-02-10 11:45:00      a 25.9 27.5 24.5  25.4   3708
## 167: 2021-02-10 11:46:00      a 25.9 26.8 24.7  26.2   2995
## 168: 2021-02-10 11:47:00      a 27.0 27.0 27.0  27.0    425
## 
## $b
##                       DT SYMBOL OPEN HIGH  LOW CLOSE VOLUME
##   1: 2021-02-10 09:00:00      b 25.1 25.4 24.9  24.9   4035
##   2: 2021-02-10 09:01:00      b 24.9 25.5 24.1  25.5   7298
##   3: 2021-02-10 09:02:00      b 25.6 26.0 24.4  26.0  10203
##   4: 2021-02-10 09:03:00      b 25.0 25.8 24.9  25.3  10281
##   5: 2021-02-10 09:04:00      b 25.2 25.5 23.6  24.8   9771
##  ---                                                       
## 164: 2021-02-10 11:43:00      b 25.0 27.6 22.8  27.6   6301
## 165: 2021-02-10 11:44:00      b 25.6 27.9 24.9  26.2   6954
## 166: 2021-02-10 11:45:00      b 26.5 28.3 24.5  25.5   5377
## 167: 2021-02-10 11:46:00      b 27.0 27.0 24.1  25.9   4820
## 168: 2021-02-10 11:47:00      b 26.0 26.4 26.0  26.4    640
## 
## $c
##                       DT SYMBOL OPEN HIGH  LOW CLOSE VOLUME
##   1: 2021-02-10 09:00:00      c 24.6 25.5 24.3  24.3   3203
##   2: 2021-02-10 09:01:00      c 25.2 25.7 24.5  24.8   5164
##   3: 2021-02-10 09:02:00      c 25.6 25.7 24.2  25.0   5397
##   4: 2021-02-10 09:03:00      c 24.8 26.0 24.3  25.3   9360
##   5: 2021-02-10 09:04:00      c 24.8 25.3 24.2  24.4  11798
##  ---                                                       
## 164: 2021-02-10 11:43:00      c 25.5 27.4 24.4  25.9   3796
## 165: 2021-02-10 11:44:00      c 23.9 27.9 23.9  26.5   4669
## 166: 2021-02-10 11:45:00      c 25.8 27.1 24.6  26.6   3722
## 167: 2021-02-10 11:46:00      c 25.9 27.5 24.2  25.0   6649
## 168: 2021-02-10 11:47:00      c 25.5 26.5 25.5  26.5    269
## 
## $d
##                       DT SYMBOL OPEN HIGH  LOW CLOSE VOLUME
##   1: 2021-02-10 09:00:00      d 25.8 25.8 23.9  24.8   4792
##   2: 2021-02-10 09:01:00      d 25.0 25.6 24.2  25.2   7186
##   3: 2021-02-10 09:02:00      d 25.4 25.4 24.2  24.4  11386
##   4: 2021-02-10 09:03:00      d 24.5 25.5 24.3  25.0   6603
##   5: 2021-02-10 09:04:00      d 24.3 25.5 24.3  24.8   5754
##  ---                                                       
## 164: 2021-02-10 11:43:00      d 27.1 27.4 23.7  25.0   5292
## 165: 2021-02-10 11:44:00      d 25.4 28.1 24.2  25.5   5210
## 166: 2021-02-10 11:45:00      d 25.5 27.5 24.2  27.5   5409
## 167: 2021-02-10 11:46:00      d 26.0 27.7 25.0  25.0   3888
## 168: 2021-02-10 11:47:00      d 26.0 26.9 26.0  26.9   1042
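The bars above are one minute apart and appear to be labelled by the end of the interval: the 09:00:00 bar for ticker a opens at 24.7 (the 08:59:30 tick) and has volume 2575, the sum of the six ticks before 09:00:00. A base R sketch of that bucketing is shown below, using the first 18 ticks of ticker a from the printout. The end-of-interval labelling is inferred from the numbers, not confirmed by the source, and the printed format suggests the original used data.table rather than this approach.

```r
# Ticker a, 08:59:30 to 09:00:55, copied from the printout above.
a <- data.frame(
  DT    = seq(as.POSIXct("2021-02-10 08:59:30", tz = "UTC"), by = 5, length.out = 18),
  PRICE = c(24.7, 24.6, 25.8, 25.0, 25.5, 25.0, 25.7, 24.8, 24.9,
            24.6, 25.2, 26.0, 26.2, 25.1, 25.2, 25.1, 24.7, 25.2),
  SIZE  = c(804, 525, 351, 80, 498, 317, 178, 499, 722,
            377, 561, 116, 709, 905, 1082, 358, 1041, 1044)
)

bar_secs <- 60
# Label each tick with the END of its bar: [08:59:00, 09:00:00) -> 09:00:00.
bar_end <- as.POSIXct((floor(as.numeric(a$DT) / bar_secs) + 1) * bar_secs,
                      origin = "1970-01-01", tz = "UTC")
key <- format(bar_end, "%Y-%m-%d %H:%M:%S")   # sortable text key per bar

# tapply() returns one value per bar, ordered by the sorted key.
ohlcv <- data.frame(
  DT     = sort(unique(key)),
  OPEN   = as.vector(tapply(a$PRICE, key, function(x) x[1])),
  HIGH   = as.vector(tapply(a$PRICE, key, max)),
  LOW    = as.vector(tapply(a$PRICE, key, min)),
  CLOSE  = as.vector(tapply(a$PRICE, key, function(x) x[length(x)])),
  VOLUME = as.vector(tapply(a$SIZE, key, sum)),
  stringsAsFactors = FALSE
)
ohlcv
```

The two resulting bars match the first two rows printed for `$a` above.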

Aggregating at a fixed interval (e.g. 60 seconds)

## $a
##                     PRICE SIZE RETURN CUM_SIZE
## 2021-02-10 08:59:00  24.7  804     NA      804
## 2021-02-10 09:00:00  25.7  178   0.03     2753
## 2021-02-10 09:01:00  24.7 1036  -0.02    11203
## 2021-02-10 09:02:00  25.2 1894   0.02    19096
## 2021-02-10 09:03:00  24.9 2347  -0.05    29299
## 2021-02-10 09:04:00  24.9 1479  -0.01    38074
## 2021-02-10 09:05:00  24.8 1318  -0.02    50177
## 2021-02-10 09:06:00  25.7  990   0.03    60325
## 2021-02-10 09:07:00  25.5 2127   0.09    74214
## 2021-02-10 09:08:00  23.7 2959  -0.08    96439
## 2021-02-10 09:09:00  24.5  202  -0.04   113258
## 2021-02-10 09:10:00  24.4 1589   0.00   130272
## 2021-02-10 09:11:00  25.5 2578   0.03   146698
## 2021-02-10 09:12:00  24.3 2010   0.00   169807
## 2021-02-10 09:13:00  25.2 4726   0.01   191094
## 2021-02-10 09:14:00  25.9 2375   0.01   213379
## 2021-02-10 09:15:00  25.4 2391   0.04   242385
## 2021-02-10 09:16:00  24.8 4998  -0.01   266521
## 2021-02-10 09:17:00  25.0 2062  -0.02   291544
## 2021-02-10 09:18:00  25.0 1098  -0.01   317701
## 2021-02-10 09:19:00  24.8 4024   0.02   349159
## 2021-02-10 09:20:00  24.9 1285  -0.01   368912
## 2021-02-10 09:21:00  25.6  623   0.03   381319
## 2021-02-10 09:22:00  24.2  170  -0.05   394076
## 2021-02-10 09:23:00  25.5  356   0.04   404279
##  [ reached getOption("max.print") -- omitted 143 rows ]
## 
## $b
##                     PRICE SIZE RETURN CUM_SIZE
## 2021-02-10 08:59:00  25.1 1057     NA     1057
## 2021-02-10 09:00:00  24.9  208   0.00     4243
## 2021-02-10 09:01:00  25.6  187   0.00    11520
## 2021-02-10 09:02:00  25.0  510  -0.04    22046
## 2021-02-10 09:03:00  25.2  279   0.00    32096
## 2021-02-10 09:04:00  25.2  833   0.02    42421
## 2021-02-10 09:05:00  24.2  909  -0.02    49380
## 2021-02-10 09:06:00  25.3 1450   0.06    58576
## 2021-02-10 09:07:00  24.7 3334   0.02    78809
## 2021-02-10 09:08:00  24.5 2205   0.01    93411
## 2021-02-10 09:09:00  25.4 2348   0.02   112692
## 2021-02-10 09:10:00  24.7 1064  -0.03   130752
## 2021-02-10 09:11:00  25.0   78   0.01   144698
## 2021-02-10 09:12:00  24.6 1763  -0.04   160840
## 2021-02-10 09:13:00  25.4  195   0.02   182910
## 2021-02-10 09:14:00  25.2 1015   0.01   203789
## 2021-02-10 09:15:00  25.5 1032   0.04   226962
## 2021-02-10 09:16:00  24.7 2248   0.00   257290
## 2021-02-10 09:17:00  25.1  677  -0.01   288022
## 2021-02-10 09:18:00  25.4 1366   0.04   318782
## 2021-02-10 09:19:00  24.7  978  -0.03   351140
## 2021-02-10 09:20:00  25.4 2039   0.03   377258
## 2021-02-10 09:21:00  25.0  414   0.00   384669
## 2021-02-10 09:22:00  25.4 1151   0.04   392737
## 2021-02-10 09:23:00  25.1  563   0.00   399504
##  [ reached getOption("max.print") -- omitted 143 rows ]
## 
## $c
##                     PRICE SIZE RETURN CUM_SIZE
## 2021-02-10 08:59:00  24.6 1035     NA     1035
## 2021-02-10 09:00:00  25.2  803   0.04     4006
## 2021-02-10 09:01:00  25.6   16   0.03     8383
## 2021-02-10 09:02:00  24.8  778  -0.01    14542
## 2021-02-10 09:03:00  24.8 2771  -0.02    25895
## 2021-02-10 09:04:00  24.6 1072   0.01    35994
## 2021-02-10 09:05:00  24.8  238  -0.01    45551
## 2021-02-10 09:06:00  25.5  446   0.01    57146
## 2021-02-10 09:07:00  24.6 1393  -0.01    74476
## 2021-02-10 09:08:00  24.5 2400  -0.02    94674
## 2021-02-10 09:09:00  25.4 1119   0.02   111750
## 2021-02-10 09:10:00  24.6  130  -0.01   125319
## 2021-02-10 09:11:00  24.0  961  -0.06   137938
## 2021-02-10 09:12:00  24.4 1173  -0.01   159068
## 2021-02-10 09:13:00  25.3 4293  -0.01   178741
## 2021-02-10 09:14:00  24.7 2543   0.02   209177
## 2021-02-10 09:15:00  24.9 3494   0.00   231152
## 2021-02-10 09:16:00  25.3 3312   0.01   255988
## 2021-02-10 09:17:00  24.5 1102  -0.02   282345
## 2021-02-10 09:18:00  25.2 5961   0.03   315542
## 2021-02-10 09:19:00  25.2  433   0.02   343622
## 2021-02-10 09:20:00  25.2  143   0.01   359705
## 2021-02-10 09:21:00  25.2 1255  -0.02   370520
## 2021-02-10 09:22:00  25.0  994   0.02   380886
## 2021-02-10 09:23:00  24.8 1050   0.02   389862
##  [ reached getOption("max.print") -- omitted 143 rows ]
## 
## $d
##                     PRICE SIZE RETURN CUM_SIZE
## 2021-02-10 08:59:00  25.8 1186     NA     1186
## 2021-02-10 09:00:00  25.0 1595   0.01     6387
## 2021-02-10 09:01:00  25.4 1122   0.01    13100
## 2021-02-10 09:02:00  24.5  736   0.00    24100
## 2021-02-10 09:03:00  24.3  154  -0.03    30121
## 2021-02-10 09:04:00  26.3 1640   0.06    37361
## 2021-02-10 09:05:00  25.7 1134   0.04    46682
## 2021-02-10 09:06:00  25.2  679   0.05    57228
## 2021-02-10 09:07:00  24.2 4460  -0.04    74221
## 2021-02-10 09:08:00  25.0  904  -0.02    96125
## 2021-02-10 09:09:00  24.8  132   0.00   106695
## 2021-02-10 09:10:00  25.0  949   0.01   128332
## 2021-02-10 09:11:00  24.6  654  -0.05   145331
## 2021-02-10 09:12:00  25.3  457  -0.02   161871
## 2021-02-10 09:13:00  23.8 1924  -0.06   182647
## 2021-02-10 09:14:00  24.8 1145  -0.01   207248
## 2021-02-10 09:15:00  23.8 1670  -0.07   236469
## 2021-02-10 09:16:00  25.5 5254   0.00   281434
## 2021-02-10 09:17:00  26.2 3032   0.04   313128
## 2021-02-10 09:18:00  25.3 4186   0.01   348439
## 2021-02-10 09:19:00  25.8 1823   0.06   377842
## 2021-02-10 09:20:00  24.7  265  -0.06   389858
## 2021-02-10 09:21:00  25.6  394   0.00   401116
## 2021-02-10 09:22:00  25.3  947   0.01   413790
## 2021-02-10 09:23:00  25.3 1892   0.05   422121
##  [ reached getOption("max.print") -- omitted 143 rows ]
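From 09:00:00 onward, each row above is the tick observed exactly on the minute mark, carrying that tick's PRICE, SIZE and CUM_SIZE. One way to sketch this in base R is previous-tick sampling on a one-minute grid with `findInterval()`. This is an assumed approach, and it does not reproduce the first printed row (the 08:59:30 tick labelled 08:59:00), whose alignment rule is not stated in the source.

```r
# Ticker a, 08:59:30 to 09:01:00, copied from the printout above.
a <- data.frame(
  DT    = seq(as.POSIXct("2021-02-10 08:59:30", tz = "UTC"), by = 5, length.out = 19),
  PRICE = c(24.7, 24.6, 25.8, 25.0, 25.5, 25.0, 25.7, 24.8, 24.9, 24.6,
            25.2, 26.0, 26.2, 25.1, 25.2, 25.1, 24.7, 25.2, 24.7),
  SIZE  = c(804, 525, 351, 80, 498, 317, 178, 499, 722, 377,
            561, 116, 709, 905, 1082, 358, 1041, 1044, 1036)
)
a$CUM_SIZE <- cumsum(a$SIZE)

step  <- 60
t_num <- as.numeric(a$DT)
# Minute grid between the first and last whole minute covered by the data.
grid <- seq(ceiling(min(t_num) / step) * step,
            floor(max(t_num) / step) * step, by = step)

# For each grid point, index of the last tick at or before it.
idx <- findInterval(grid, t_num)
agg <- a[idx, ]
agg$DT <- as.POSIXct(grid, origin = "1970-01-01", tz = "UTC")
rownames(agg) <- NULL
agg
```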

Merging OHLCV and Agg

## $a
##                     PRICE SIZE RETURN CUM_SIZE OPEN HIGH  LOW CLOSE VOLUME
## 2021-02-10 08:59:00  24.7  804     NA      804   NA   NA   NA    NA     NA
## 2021-02-10 09:00:00  25.7  178   0.03     2753 24.7 25.8 24.6  25.0   2575
## 2021-02-10 09:01:00  24.7 1036  -0.02    11203 25.7 26.2 24.6  25.2   7592
## 2021-02-10 09:02:00  25.2 1894   0.02    19096 24.7 25.9 24.0  24.6   7035
## 2021-02-10 09:03:00  24.9 2347  -0.05    29299 25.2 26.2 24.4  26.2   9750
## 2021-02-10 09:04:00  24.9 1479  -0.01    38074 24.9 25.9 24.0  25.1   9643
## 2021-02-10 09:05:00  24.8 1318  -0.02    50177 24.9 25.8 24.5  25.2  12264
## 2021-02-10 09:06:00  25.7  990   0.03    60325 24.8 25.7 23.7  25.0  10476
## 2021-02-10 09:07:00  25.5 2127   0.09    74214 25.7 25.8 23.5  23.5  12752
## 2021-02-10 09:08:00  23.7 2959  -0.08    96439 25.5 26.2 24.3  25.8  21393
## 2021-02-10 09:09:00  24.5  202  -0.04   113258 23.7 25.5 23.7  25.5  19576
##  [ reached getOption("max.print") -- omitted 158 rows ]
## 
## $b
##                     PRICE SIZE RETURN CUM_SIZE OPEN HIGH  LOW CLOSE VOLUME
## 2021-02-10 08:59:00  25.1 1057     NA     1057   NA   NA   NA    NA     NA
## 2021-02-10 09:00:00  24.9  208   0.00     4243 25.1 25.4 24.9  24.9   4035
## 2021-02-10 09:01:00  25.6  187   0.00    11520 24.9 25.5 24.1  25.5   7298
## 2021-02-10 09:02:00  25.0  510  -0.04    22046 25.6 26.0 24.4  26.0  10203
## 2021-02-10 09:03:00  25.2  279   0.00    32096 25.0 25.8 24.9  25.3  10281
## 2021-02-10 09:04:00  25.2  833   0.02    42421 25.2 25.5 23.6  24.8   9771
## 2021-02-10 09:05:00  24.2  909  -0.02    49380 25.2 25.7 23.9  24.8   6883
## 2021-02-10 09:06:00  25.3 1450   0.06    58576 24.2 25.7 23.9  23.9   8655
## 2021-02-10 09:07:00  24.7 3334   0.02    78809 25.3 25.9 24.2  24.3  18349
## 2021-02-10 09:08:00  24.5 2205   0.01    93411 24.7 26.1 24.2  24.3  15731
## 2021-02-10 09:09:00  25.4 2348   0.02   112692 24.5 25.6 23.8  24.9  19138
##  [ reached getOption("max.print") -- omitted 158 rows ]
## 
## $c
##                     PRICE SIZE RETURN CUM_SIZE OPEN HIGH  LOW CLOSE VOLUME
## 2021-02-10 08:59:00  24.6 1035     NA     1035   NA   NA   NA    NA     NA
## 2021-02-10 09:00:00  25.2  803   0.04     4006 24.6 25.5 24.3  24.3   3203
## 2021-02-10 09:01:00  25.6   16   0.03     8383 25.2 25.7 24.5  24.8   5164
## 2021-02-10 09:02:00  24.8  778  -0.01    14542 25.6 25.7 24.2  25.0   5397
## 2021-02-10 09:03:00  24.8 2771  -0.02    25895 24.8 26.0 24.3  25.3   9360
## 2021-02-10 09:04:00  24.6 1072   0.01    35994 24.8 25.3 24.2  24.4  11798
## 2021-02-10 09:05:00  24.8  238  -0.01    45551 24.6 26.0 24.6  25.1  10391
## 2021-02-10 09:06:00  25.5  446   0.01    57146 24.8 25.3 24.5  25.2  11387
## 2021-02-10 09:07:00  24.6 1393  -0.01    74476 25.5 25.6 24.2  24.8  16383
## 2021-02-10 09:08:00  24.5 2400  -0.02    94674 24.6 25.6 24.0  24.9  19191
## 2021-02-10 09:09:00  25.4 1119   0.02   111750 24.5 25.6 24.4  24.9  18357
##  [ reached getOption("max.print") -- omitted 158 rows ]
## 
## $d
##                     PRICE SIZE RETURN CUM_SIZE OPEN HIGH  LOW CLOSE VOLUME
## 2021-02-10 08:59:00  25.8 1186     NA     1186   NA   NA   NA    NA     NA
## 2021-02-10 09:00:00  25.0 1595   0.01     6387 25.8 25.8 23.9  24.8   4792
## 2021-02-10 09:01:00  25.4 1122   0.01    13100 25.0 25.6 24.2  25.2   7186
## 2021-02-10 09:02:00  24.5  736   0.00    24100 25.4 25.4 24.2  24.4  11386
## 2021-02-10 09:03:00  24.3  154  -0.03    30121 24.5 25.5 24.3  25.0   6603
## 2021-02-10 09:04:00  26.3 1640   0.06    37361 24.3 25.5 24.3  24.8   5754
## 2021-02-10 09:05:00  25.7 1134   0.04    46682 26.3 26.3 24.1  24.6   9827
## 2021-02-10 09:06:00  25.2  679   0.05    57228 25.7 25.7 24.0  24.0  11001
## 2021-02-10 09:07:00  24.2 4460  -0.04    74221 25.2 25.7 24.1  25.1  13212
## 2021-02-10 09:08:00  25.0  904  -0.02    96125 24.2 25.7 24.0  25.6  25460
## 2021-02-10 09:09:00  24.8  132   0.00   106695 25.0 25.3 23.9  24.7  11342
##  [ reached getOption("max.print") -- omitted 158 rows ]
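The merged tables above keep every row of the aggregated data and fill the OHLCV columns with NA where no bar exists (the 08:59:00 row). That is the behaviour of a left join on the timestamp, sketched here with base R `merge()` on the first rows of ticker a from the printouts. The printed format suggests the original worked on xts objects, so this is an equivalent illustration rather than the author's code.

```r
ts <- function(x) as.POSIXct(x, tz = "UTC")  # small local helper

# Aggregated rows for ticker a (from the printout above).
agg <- data.frame(
  DT       = ts(c("2021-02-10 08:59:00", "2021-02-10 09:00:00", "2021-02-10 09:01:00")),
  PRICE    = c(24.7, 25.7, 24.7),
  SIZE     = c(804, 178, 1036),
  RETURN   = c(NA, 0.03, -0.02),
  CUM_SIZE = c(804, 2753, 11203)
)

# OHLCV bars for ticker a (from the printout above).
ohlcv <- data.frame(
  DT     = ts(c("2021-02-10 09:00:00", "2021-02-10 09:01:00")),
  OPEN   = c(24.7, 25.7),
  HIGH   = c(25.8, 26.2),
  LOW    = c(24.6, 24.6),
  CLOSE  = c(25.0, 25.2),
  VOLUME = c(2575, 7592)
)

# Left join: keep all aggregated rows, attach the bar with the same label.
merged <- merge(agg, ohlcv, by = "DT", all.x = TRUE)
merged <- merged[order(merged$DT), ]
rownames(merged) <- NULL
merged
```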

Extracting the data at a specific time (09:01:30; the rows returned are those labelled 09:02:00)

## # A tibble: 4 x 10
##   SYMBOL PRICE  SIZE RETURN CUM_SIZE  OPEN  HIGH   LOW CLOSE VOLUME
##   <chr>  <dbl> <dbl>  <dbl>    <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>
## 1 a       25.2  1894   0.02    19096  24.7  25.9  24    24.6   7035
## 2 b       25     510  -0.04    22046  25.6  26    24.4  26    10203
## 3 c       24.8   778  -0.01    14542  25.6  25.7  24.2  25     5397
## 4 d       24.5   736   0       24100  25.4  25.4  24.2  24.4  11386
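The tibble above has one row per ticker, and its values match the 09:02:00 rows of the merged tables. A base R sketch of that cross-section follows: pick the row with the target label from each ticker's table and stack the results with a SYMBOL column. Only two tickers and two columns are included to keep it short, and the names `merged_list`, `target` and `snapshot` are illustrative.

```r
dt <- as.POSIXct(c("2021-02-10 09:01:00", "2021-02-10 09:02:00"), tz = "UTC")

# Per-ticker merged tables, trimmed to two rows and two columns (from the printout).
merged_list <- list(
  a = data.frame(DT = dt, PRICE = c(24.7, 25.2), CUM_SIZE = c(11203, 19096)),
  b = data.frame(DT = dt, PRICE = c(25.6, 25.0), CUM_SIZE = c(11520, 22046))
)

target <- as.POSIXct("2021-02-10 09:02:00", tz = "UTC")

# Take the target row from each ticker and tag it with the ticker name.
rows <- lapply(names(merged_list), function(sym) {
  df  <- merged_list[[sym]]
  hit <- df[df$DT == target, c("PRICE", "CUM_SIZE")]
  data.frame(SYMBOL = sym, hit, stringsAsFactors = FALSE)
})
snapshot <- do.call(rbind, rows)
rownames(snapshot) <- NULL
snapshot
```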

Extracting a single variable (e.g. CUMSIZE)

##                     a_CUMSIZE b_CUMSIZE c_CUMSIZE d_CUMSIZE
## 2021-02-10 08:59:00       804      1057      1035      1186
## 2021-02-10 09:00:00      2753      4243      4006      6387
## 2021-02-10 09:01:00     11203     11520      8383     13100
## 2021-02-10 09:02:00     19096     22046     14542     24100
## 2021-02-10 09:03:00     29299     32096     25895     30121
## 2021-02-10 09:04:00     38074     42421     35994     37361
## 2021-02-10 09:05:00     50177     49380     45551     46682
## 2021-02-10 09:06:00     60325     58576     57146     57228
## 2021-02-10 09:07:00     74214     78809     74476     74221
## 2021-02-10 09:08:00     96439     93411     94674     96125
## 2021-02-10 09:09:00    113258    112692    111750    106695
## 2021-02-10 09:10:00    130272    130752    125319    128332
## 2021-02-10 09:11:00    146698    144698    137938    145331
## 2021-02-10 09:12:00    169807    160840    159068    161871
## 2021-02-10 09:13:00    191094    182910    178741    182647
## 2021-02-10 09:14:00    213379    203789    209177    207248
## 2021-02-10 09:15:00    242385    226962    231152    236469
## 2021-02-10 09:16:00    266521    257290    255988    281434
## 2021-02-10 09:17:00    291544    288022    282345    313128
## 2021-02-10 09:18:00    317701    318782    315542    348439
## 2021-02-10 09:19:00    349159    351140    343622    377842
## 2021-02-10 09:20:00    368912    377258    359705    389858
## 2021-02-10 09:21:00    381319    384669    370520    401116
## 2021-02-10 09:22:00    394076    392737    380886    413790
## 2021-02-10 09:23:00    404279    399504    389862    422121
##  [ reached getOption("max.print") -- omitted 144 rows ]
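The wide table above places the same variable from every ticker side by side, one column per ticker. Assuming all tickers share the same timestamps, `sapply()` over the list is enough to sketch it in base R. The input below uses the first three rows of each ticker from the printout, and the column suffix follows the names shown above.

```r
dt <- as.POSIXct(c("2021-02-10 08:59:00", "2021-02-10 09:00:00",
                   "2021-02-10 09:01:00"), tz = "UTC")

# Per-ticker tables reduced to the timestamp and the variable of interest.
merged_list <- list(
  a = data.frame(DT = dt, CUM_SIZE = c(804, 2753, 11203)),
  b = data.frame(DT = dt, CUM_SIZE = c(1057, 4243, 11520)),
  c = data.frame(DT = dt, CUM_SIZE = c(1035, 4006, 8383)),
  d = data.frame(DT = dt, CUM_SIZE = c(1186, 6387, 13100))
)

# One column per ticker; assumes every ticker has identical timestamps.
wide <- sapply(merged_list, function(df) df$CUM_SIZE)
colnames(wide) <- paste0(colnames(wide), "_CUMSIZE")
rownames(wide) <- format(dt, "%Y-%m-%d %H:%M:%S")
wide
```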

Ranking the extracted variable over time (time-varying ranks)

##                     a_CUMSIZE b_CUMSIZE c_CUMSIZE d_CUMSIZE
## 2021-02-10 08:59:00         4         2         3         1
## 2021-02-10 09:00:00         4         2         3         1
## 2021-02-10 09:01:00         3         2         4         1
## 2021-02-10 09:02:00         3         2         4         1
## 2021-02-10 09:03:00         3         1         4         2
## 2021-02-10 09:04:00         2         1         4         3
## 2021-02-10 09:05:00         1         2         4         3
## 2021-02-10 09:06:00         1         2         4         3
## 2021-02-10 09:07:00         4         1         2         3
## 2021-02-10 09:08:00         1         4         3         2
## 2021-02-10 09:09:00         1         2         3         4
## 2021-02-10 09:10:00         2         1         4         3
## 2021-02-10 09:11:00         1         3         4         2
## 2021-02-10 09:12:00         1         3         4         2
## 2021-02-10 09:13:00         1         2         4         3
## 2021-02-10 09:14:00         1         4         2         3
## 2021-02-10 09:15:00         1         4         3         2
## 2021-02-10 09:16:00         2         3         4         1
## 2021-02-10 09:17:00         2         3         4         1
## 2021-02-10 09:18:00         3         2         4         1
## 2021-02-10 09:19:00         3         2         4         1
## 2021-02-10 09:20:00         3         2         4         1
## 2021-02-10 09:21:00         3         2         4         1
## 2021-02-10 09:22:00         2         3         4         1
## 2021-02-10 09:23:00         2         3         4         1
##  [ reached getOption("max.print") -- omitted 144 rows ]
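In the ranks above, 1 marks the largest cumulative size at each timestamp (at 08:59:00, d has 1186 and rank 1, a has 804 and rank 4). A minimal base R sketch ranks each row of the wide table in descending order by negating the values before calling `rank()`. The three input rows are copied from the printout; how the original handled ties is not shown, so the default tie handling here is an assumption.

```r
# Three rows of the wide CUMSIZE table (08:59:00, 09:01:00, 09:07:00).
wide <- rbind(
  "2021-02-10 08:59:00" = c(804, 1057, 1035, 1186),
  "2021-02-10 09:01:00" = c(11203, 11520, 8383, 13100),
  "2021-02-10 09:07:00" = c(74214, 78809, 74476, 74221)
)
colnames(wide) <- c("a_CUMSIZE", "b_CUMSIZE", "c_CUMSIZE", "d_CUMSIZE")

# Rank within each row; negating makes the largest value rank 1.
# apply() returns tickers in rows, so transpose back to time x ticker.
ranks <- t(apply(-wide, 1, rank))
ranks
```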

Co-movement question: did the stocks that satisfied condition 000 during the first minute move similarly to the index?

Short-term changes are better interpreted through correlations of returns, whereas assessments of long-term evolution may be better served by prices. We therefore use returns and compute a rolling correlation over a 10-minute window.

Rolling (e.g. 10-minute) Pearson correlation

## # A tibble: 3 x 2
##   term     a_return
##   <chr>       <dbl>
## 1 b_return    0.091
## 2 c_return    0.08 
## 3 d_return    0.03
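For reference, the Pearson correlation computed within each window of $w$ observations ending at time $t$ is

$$\rho_t = \frac{\sum_{i=t-w+1}^{t} (x_i - \bar{x}_t)(y_i - \bar{y}_t)}{\sqrt{\sum_{i=t-w+1}^{t} (x_i - \bar{x}_t)^2}\,\sqrt{\sum_{i=t-w+1}^{t} (y_i - \bar{y}_t)^2}},$$

where $\bar{x}_t$ and $\bar{y}_t$ are the window means. The base R sketch below evaluates it over a 10-observation window, which corresponds to 10 minutes for one-minute returns. The function name `roll_cor` is hypothetical and the two return series are simulated under assumed parameters, so the values are unrelated to the table above.

```r
# Simulated one-minute returns for two series (assumed parameters).
set.seed(7)
n <- 60
a_return <- rnorm(n, sd = 0.02)
b_return <- 0.5 * a_return + rnorm(n, sd = 0.02)  # partly driven by a_return

# Hypothetical helper: trailing-window Pearson correlation.
# Positions before the first full window are NA.
roll_cor <- function(x, y, width) {
  stopifnot(length(x) == length(y), width >= 2)
  out <- rep(NA_real_, length(x))
  if (length(x) >= width) {
    for (i in width:length(x)) {
      idx <- (i - width + 1):i          # the last `width` observations up to i
      out[i] <- cor(x[idx], y[idx])     # Pearson is cor()'s default method
    }
  }
  out
}

rc <- roll_cor(a_return, b_return, width = 10)
tail(round(rc, 3))
```

A trailing window uses only past observations at each point, which keeps the series usable for questions about what was knowable at that time.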