Overview


Dataset statistics

Number of variables 29
Number of observations 294266
Missing cells 0
Missing cells (%) 0.0%
Duplicate rows 3
Duplicate rows (%) < 0.1%
Total size in memory 345.5 MiB
Average record size in memory 1.2 KiB
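The two derived figures above follow from the raw counts. A minimal sketch that rechecks them, assuming the total size is exactly the rounded 345.5 MiB shown (the variable names are illustrative and not part of the report):

```python
# Figures copied from the "Dataset statistics" block of the report.
n_observations = 294_266
n_duplicates = 3
total_size_mib = 345.5  # rounded value as displayed, so results are approximate

# Average record size in KiB: total size (MiB converted to KiB) per row.
avg_record_kib = total_size_mib * 1024 / n_observations

# Share of duplicate rows as a percentage of all rows.
duplicate_pct = 100 * n_duplicates / n_observations

print(f"average record size ~ {avg_record_kib:.1f} KiB")
print(f"duplicate rows ~ {duplicate_pct:.4f}%")
```

Both results agree with the displayed values (1.2 KiB and < 0.1%).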

Variable types

Numeric 7
Text 6
Categorical 14
DateTime 2

Alerts

Dataset has 3 (< 0.1%) duplicate rows [Duplicates]
Amount is highly overall correlated with Total_Amount [High correlation]
Feedback is highly overall correlated with Ratings [High correlation]
Month is highly overall correlated with Year [High correlation]
Product_Brand is highly overall correlated with Product_Category and 1 other fields [High correlation]
Product_Category is highly overall correlated with Product_Brand and 1 other fields [High correlation]
Product_Type is highly overall correlated with Product_Brand and 1 other fields [High correlation]
Ratings is highly overall correlated with Feedback [High correlation]
Total_Amount is highly overall correlated with Amount and 1 other fields [High correlation]
Total_Purchases is highly overall correlated with Total_Amount [High correlation]
Year is highly overall correlated with Month [High correlation]

Reproduction

Analysis started 2025-09-11 12:42:35.338396
Analysis finished 2025-09-11 12:43:50.837846
Duration 1 minute and 15.5 seconds
Software version ydata-profiling v4.16.1
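The reported duration can be rechecked from the two timestamps. A small sketch using only the Python standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2025-09-11 12:42:35.338396")
finished = datetime.fromisoformat("2025-09-11 12:43:50.837846")

# Elapsed wall-clock time in seconds, then split into minutes and seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)

print(f"{int(minutes)} minute(s) and {seconds:.1f} seconds")
```

The difference is about 75.5 seconds, consistent with the stated 1 minute and 15.5 seconds.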

Variables

Transaction_ID
Real number (ℝ)

Distinct 287345
Distinct (%) 97.6%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 5493785.1
Minimum 1000007
Maximum 9999995
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 2.2 MiB

Quantile statistics

Minimum 1000007
5-th percentile 1446260
Q1 3245886.8
median 5495935
Q3 7738152.8
95-th percentile 9547991
Maximum 9999995
Range 8999988
Interquartile range (IQR) 4492266

Descriptive statistics

Standard deviation 2596084.5
Coefficient of variation (CV) 0.47254934
Kurtosis -1.1966785
Mean 5493785.1
Median Absolute Deviation (MAD) 2246218.5
Skewness 0.0025118782
Sum 1.6166342 × 10^12
Variance 6.7396548 × 10^12
Monotonicity Not monotonic
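Several of the Transaction_ID figures are derived from others in the same block. A minimal sketch that recomputes them from the displayed (rounded) inputs, so small differences in the last digits are expected; the variable names are illustrative:

```python
import math

# Values copied from the Transaction_ID statistics above (rounded as displayed).
n = 294_266
minimum, maximum = 1_000_007, 9_999_995
q1, q3 = 3_245_886.8, 7_738_152.8
mean, std = 5_493_785.1, 2_596_084.5

value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # Interquartile range = Q3 - Q1
cv = std / mean                  # Coefficient of variation = std / mean
variance = std ** 2              # Variance = std squared
total = mean * n                 # Sum = mean * number of observations

print(value_range, iqr)
print(f"CV ~ {cv:.6f}, variance ~ {variance:.4e}, sum ~ {total:.4e}")
```

Within rounding, the results match the reported Range, IQR, CV, Variance, and Sum.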
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
6145934 4 < 0.1%
3725822 3 < 0.1%
6565862 3 < 0.1%
9466196 3 < 0.1%
5840674 3 < 0.1%
3137725 3 < 0.1%
5459500 3 < 0.1%
9685446 3 < 0.1%
9703579 3 < 0.1%
9184711 3 < 0.1%
Other values (287335) 294235 > 99.9%
Value Count Frequency (%)
1000007 1 < 0.1%
1000043 1 < 0.1%
1000073 1 < 0.1%
1000088 1 < 0.1%
1000154 1 < 0.1%
1000183 1 < 0.1%
1000195 1 < 0.1%
1000219 1 < 0.1%
1000294 1 < 0.1%
1000307 1 < 0.1%
Value Count Frequency (%)
9999995 1 < 0.1%
9999945 1 < 0.1%
9999922 1 < 0.1%
9999909 1 < 0.1%
9999823 1 < 0.1%
9999786 1 < 0.1%
9999785 1 < 0.1%
9999782 1 < 0.1%
9999775 1 < 0.1%
9999647 1 < 0.1%

Customer_ID
Real number (ℝ)

Distinct 86501
Distinct (%) 29.4%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 55015.611
Minimum 10000
Maximum 99999
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 2.2 MiB

Quantile statistics

Minimum 10000
5-th percentile 14443
Q1 32470.25
median 55029
Q3 77518
95-th percentile 95587.75
Maximum 99999
Range 89999
Interquartile range (IQR) 45047.75

Descriptive statistics

Standard deviation 26010.244
Coefficient of variation (CV) 0.47277932
Kurtosis -1.2002798
Mean 55015.611
Median Absolute Deviation (MAD) 22524
Skewness 0.00017360816
Sum 1.6189224 × 10^10
Variance 6.7653277 × 10^8
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
99355 13 < 0.1%
60341 13 < 0.1%
47382 13 < 0.1%
49274 13 < 0.1%
43157 12 < 0.1%
56081 12 < 0.1%
49587 12 < 0.1%
48453 12 < 0.1%
36179 12 < 0.1%
80994 12 < 0.1%
Other values (86491) 294142 > 99.9%
Value Count Frequency (%)
10000 4 < 0.1%
10001 5 < 0.1%
10002 5 < 0.1%
10003 2 < 0.1%
10004 2 < 0.1%
10005 1 < 0.1%
10006 4 < 0.1%
10007 5 < 0.1%
10008 5 < 0.1%
10009 3 < 0.1%
Value Count Frequency (%)
99999 2 < 0.1%
99998 5 < 0.1%
99997 4 < 0.1%
99996 4 < 0.1%
99995 8 < 0.1%
99994 7 < 0.1%
99993 3 < 0.1%
99992 1 < 0.1%
99991 2 < 0.1%
99990 5 < 0.1%

Name
Text

Distinct 156752
Distinct (%) 53.3%
Missing 0
Missing (%) 0.0%
Memory size 17.5 MiB

Length

Max length 29
Median length 27
Mean length 13.273817
Min length 6

Characters and Unicode

Total characters 3906033
Distinct characters 54
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
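The mean length reported for Name is consistent with the total character count divided by the number of observations. A short sketch of that check (the variable names are illustrative):

```python
# Values copied from the Overview and from the Name "Characters and Unicode" block.
n_observations = 294_266
total_characters = 3_906_033

# Mean string length = total characters / number of rows.
mean_length = total_characters / n_observations

print(f"mean length ~ {mean_length:.6f}")
```

This reproduces the displayed mean length of 13.273817.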

Unique

Unique 105419
Unique (%) 35.8%

Sample

1st row Michelle Harrington
2nd row Kelsey Hill
3rd row Scott Jensen
4th row Joseph Miller
5th row Debra Coleman
Value Count Frequency (%)
michael 6736 1.1%
smith 6332 1.1%
johnson 5029 0.8%
james 5013 0.8%
david 4674 0.8%
jennifer 4296 0.7%
christopher 4146 0.7%
john 4088 0.7%
williams 4046 0.7%
thomas 3982 0.7%
Other values (1588) 553445 92.0%

Most occurring characters

Value Count Frequency (%)
e 362487 9.3%
a 360722 9.2%
(space) 307521 7.9%
n 293525 7.5%
r 281732 7.2%
i 236593 6.1%
o 210195 5.4%
l 198439 5.1%
s 175202 4.5%
t 135801 3.5%
Other values (44) 1343816 34.4%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 2980898 76.3%
Uppercase Letter 611316 15.7%
Space Separator 307521 7.9%
Other Punctuation 6298 0.2%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
e 362487 12.2%
a 360722 12.1%
n 293525 9.8%
r 281732 9.5%
i 236593 7.9%
o 210195 7.1%
l 198439 6.7%
s 175202 5.9%
t 135801 4.6%
h 131447 4.4%
Other values (16) 594755 20.0%
Uppercase Letter
Value Count Frequency (%)
M 68721 11.2%
J 60124 9.8%
S 50042 8.2%
C 45410 7.4%
D 40904 6.7%
B 37965 6.2%
R 37855 6.2%
A 37406 6.1%
W 28920 4.7%
H 28444 4.7%
Other values (16) 175525 28.7%
Space Separator
Value Count Frequency (%)
(space) 307521 100.0%
Other Punctuation
Value Count Frequency (%)
. 6298 100.0%

Most occurring scripts

Value Count Frequency (%)
Latin 3592214 92.0%
Common 313819 8.0%

Most frequent character per script

Latin
Value Count Frequency (%)
e 362487 10.1%
a 360722 10.0%
n 293525 8.2%
r 281732 7.8%
i 236593 6.6%
o 210195 5.9%
l 198439 5.5%
s 175202 4.9%
t 135801 3.8%
h 131447 3.7%
Other values (42) 1206071 33.6%
Common
Value Count Frequency (%)
(space) 307521 98.0%
. 6298 2.0%

Most occurring blocks

Value Count Frequency (%)
ASCII 3906033 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
e 362487 9.3%
a 360722 9.2%
(space) 307521 7.9%
n 293525 7.5%
r 281732 7.2%
i 236593 6.1%
o 210195 5.4%
l 198439 5.1%
s 175202 4.5%
t 135801 3.5%
Other values (44) 1343816 34.4%

Email
Text

Distinct 52549
Distinct (%) 17.9%
Missing 0
Missing (%) 0.0%
Memory size 18.8 MiB

Length

Max length 24
Median length 22
Mean length 17.950949
Min length 13

Characters and Unicode

Total characters 5282354
Distinct characters 62
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 14485
Unique (%) 4.9%

Sample

1st row Ebony39@gmail.com
2nd row Mark36@gmail.com
3rd row Shane85@gmail.com
4th row Mary34@gmail.com
5th row Charles30@gmail.com
Value Count Frequency (%)
michael59@gmail.com 92 < 0.1%
michael95@gmail.com 91 < 0.1%
michael17@gmail.com 88 < 0.1%
michael39@gmail.com 85 < 0.1%
michael50@gmail.com 84 < 0.1%
michael58@gmail.com 83 < 0.1%
michael55@gmail.com 83 < 0.1%
michael40@gmail.com 82 < 0.1%
michael30@gmail.com 81 < 0.1%
michael93@gmail.com 81 < 0.1%
Other values (52539) 293416 99.7%

Most occurring characters

Value Count Frequency (%)
m 622301 11.8%
a 510140 9.7%
i 432917 8.2%
l 386809 7.3%
o 364935 6.9%
c 338618 6.4%
g 304708 5.8%
@ 294266 5.6%
. 294266 5.6%
e 187769 3.6%
Other values (52) 1545625 29.3%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 3834516 72.6%
Other Punctuation 588532 11.1%
Decimal Number 565040 10.7%
Uppercase Letter 294266 5.6%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
m 622301 16.2%
a 510140 13.3%
i 432917 11.3%
l 386809 10.1%
o 364935 9.5%
c 338618 8.8%
g 304708 7.9%
e 187769 4.9%
n 142704 3.7%
r 123763 3.2%
Other values (16) 419852 10.9%
Uppercase Letter
Value Count Frequency (%)
J 44240 15.0%
M 29901 10.2%
A 27433 9.3%
C 23466 8.0%
S 22952 7.8%
D 20972 7.1%
K 19287 6.6%
R 17626 6.0%
T 15801 5.4%
B 14440 4.9%
Other values (14) 58148 19.8%
Decimal Number
Value Count Frequency (%)
1 61613 10.9%
2 59154 10.5%
5 59121 10.5%
4 58919 10.4%
7 58874 10.4%
6 58798 10.4%
3 58774 10.4%
9 58570 10.4%
8 58502 10.4%
0 32715 5.8%
Other Punctuation
Value Count Frequency (%)
@ 294266 50.0%
. 294266 50.0%

Most occurring scripts

Value Count Frequency (%)
Latin 4128782 78.2%
Common 1153572 21.8%

Most frequent character per script

Latin
Value Count Frequency (%)
m 622301 15.1%
a 510140 12.4%
i 432917 10.5%
l 386809 9.4%
o 364935 8.8%
c 338618 8.2%
g 304708 7.4%
e 187769 4.5%
n 142704 3.5%
r 123763 3.0%
Other values (40) 714118 17.3%
Common
Value Count Frequency (%)
@ 294266 25.5%
. 294266 25.5%
1 61613 5.3%
2 59154 5.1%
5 59121 5.1%
4 58919 5.1%
7 58874 5.1%
6 58798 5.1%
3 58774 5.1%
9 58570 5.1%
Other values (2) 91217 7.9%

Most occurring blocks

Value Count Frequency (%)
ASCII 5282354 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
m 622301 11.8%
a 510140 9.7%
i 432917 8.2%
l 386809 7.3%
o 364935 6.9%
c 338618 6.4%
g 304708 5.8%
@ 294266 5.6%
. 294266 5.6%
e 187769 3.6%
Other values (52) 1545625 29.3%

Address
Text

Distinct 291963
Distinct (%) 99.2%
Missing 0
Missing (%) 0.0%
Memory size 20.1 MiB

Length

Max length 38
Median length 32
Mean length 22.475393
Min length 10

Characters and Unicode

Total characters 6613744
Distinct characters 64
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 289660
Unique (%) 98.4%

Sample

1st row 3959 Amanda Burgs
2nd row 82072 Dawn Centers
3rd row 4133 Young Canyon
4th row 8148 Thomas Creek Suite 100
5th row 5813 Lori Ports Suite 269
Value Count Frequency (%)
apt 73909 6.3%
suite 73624 6.3%
michael 3360 0.3%
smith 3247 0.3%
stravenue 2667 0.2%
mission 2664 0.2%
drive 2661 0.2%
islands 2657 0.2%
summit 2655 0.2%
plains 2654 0.2%
Other values (75130) 1007766 85.6%

Most occurring characters

Value Count Frequency (%)
(space) 883598 13.4%
e 409955 6.2%
a 331008 5.0%
i 291629 4.4%
t 291290 4.4%
r 263693 4.0%
n 248008 3.7%
s 223736 3.4%
o 208890 3.2%
l 198545 3.0%
Other values (54) 3263392 49.3%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 3299628 49.9%
Decimal Number 1620544 24.5%
Space Separator 883598 13.4%
Uppercase Letter 736065 11.1%
Other Punctuation 73909 1.1%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
e 409955 12.4%
a 331008 10.0%
i 291629 8.8%
t 291290 8.8%
r 263693 8.0%
n 248008 7.5%
s 223736 6.8%
o 208890 6.3%
l 198545 6.0%
u 145414 4.4%
Other values (16) 687460 20.8%
Uppercase Letter
Value Count Frequency (%)
S 130476 17.7%
A 95116 12.9%
C 55708 7.6%
M 51083 6.9%
P 42135 5.7%
R 41220 5.6%
J 31701 4.3%
L 30658 4.2%
B 28368 3.9%
T 27417 3.7%
Other values (16) 202183 27.5%
Decimal Number
Value Count Frequency (%)
9 162478 10.0%
2 162464 10.0%
5 162457 10.0%
8 162372 10.0%
3 162342 10.0%
1 161977 10.0%
6 161963 10.0%
4 161940 10.0%
7 161520 10.0%
0 161031 9.9%
Space Separator
Value Count Frequency (%)
(space) 883598 100.0%
Other Punctuation
Value Count Frequency (%)
. 73909 100.0%

Most occurring scripts

Value Count Frequency (%)
Latin 4035693 61.0%
Common 2578051 39.0%

Most frequent character per script

Latin
Value Count Frequency (%)
e 409955 10.2%
a 331008 8.2%
i 291629 7.2%
t 291290 7.2%
r 263693 6.5%
n 248008 6.1%
s 223736 5.5%
o 208890 5.2%
l 198545 4.9%
u 145414 3.6%
Other values (42) 1423525 35.3%
Common
Value Count Frequency (%)
(space) 883598 34.3%
9 162478 6.3%
2 162464 6.3%
5 162457 6.3%
8 162372 6.3%
3 162342 6.3%
1 161977 6.3%
6 161963 6.3%
4 161940 6.3%
7 161520 6.3%
Other values (2) 234940 9.1%

Most occurring blocks

Value Count Frequency (%)
ASCII 6613744 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
(space) 883598 13.4%
e 409955 6.2%
a 331008 5.0%
i 291629 4.4%
t 291290 4.4%
r 263693 4.0%
n 248008 3.7%
s 223736 3.4%
o 208890 3.2%
l 198545 3.0%
Other values (54) 3263392 49.3%

City
Text

Distinct 130
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 16.2 MiB

Length

Max length 19
Median length 14
Mean length 8.2868527
Min length 4

Characters and Unicode

Total characters 2438539
Distinct characters 53
Distinct categories 5
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Dortmund
2nd row Nottingham
3rd row Geelong
4th row Edmonton
5th row Bristol
Value Count Frequency (%)
chicago 21143 6.3%
portsmouth 19679 5.8%
san 14561 4.3%
francisco 11954 3.5%
frankfurt 9963 2.9%
boston 9197 2.7%
new 6212 1.8%
york 5325 1.6%
fort 5096 1.5%
worth 5096 1.5%
Other values (132) 229886 68.0%

Most occurring characters

Value Count Frequency (%)
o 247597 10.2%
a 194752 8.0%
n 182786 7.5%
r 168599 6.9%
t 162326 6.7%
e 157200 6.4%
i 133476 5.5%
s 115451 4.7%
l 97072 4.0%
h 88199 3.6%
Other values (43) 891081 36.5%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 2049940 84.1%
Uppercase Letter 338136 13.9%
Space Separator 43846 1.8%
Other Punctuation 4400 0.2%
Dash Punctuation 2217 0.1%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
o 247597 12.1%
a 194752 9.5%
n 182786 8.9%
r 168599 8.2%
t 162326 7.9%
e 157200 7.7%
i 133476 6.5%
s 115451 5.6%
l 97072 4.7%
h 88199 4.3%
Other values (16) 502482 24.5%
Uppercase Letter
Value Count Frequency (%)
C 41660 12.3%
B 40558 12.0%
S 30471 9.0%
F 27897 8.3%
P 27598 8.2%
L 18985 5.6%
W 17913 5.3%
M 17536 5.2%
N 15838 4.7%
D 14320 4.2%
Other values (13) 85360 25.2%
Other Punctuation
Value Count Frequency (%)
. 2200 50.0%
' 2200 50.0%
Space Separator
Value Count Frequency (%)
(space) 43846 100.0%
Dash Punctuation
Value Count Frequency (%)
- 2217 100.0%

Most occurring scripts

Value Count Frequency (%)
Latin 2388076 97.9%
Common 50463 2.1%

Most frequent character per script

Latin
Value Count Frequency (%)
o 247597 10.4%
a 194752 8.2%
n 182786 7.7%
r 168599 7.1%
t 162326 6.8%
e 157200 6.6%
i 133476 5.6%
s 115451 4.8%
l 97072 4.1%
h 88199 3.7%
Other values (39) 840618 35.2%
Common
Value Count Frequency (%)
(space) 43846 86.9%
- 2217 4.4%
. 2200 4.4%
' 2200 4.4%

Most occurring blocks

Value Count Frequency (%)
ASCII 2434148 99.8%
None 4391 0.2%

Most frequent character per block

ASCII
Value Count Frequency (%)
o 247597 10.2%
a 194752 8.0%
n 182786 7.5%
r 168599 6.9%
t 162326 6.7%
e 157200 6.5%
i 133476 5.5%
s 115451 4.7%
l 97072 4.0%
h 88199 3.6%
Other values (42) 886690 36.4%
None
Value Count Frequency (%)
ü 4391 100.0%

State
Text

Distinct 54
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 16.1 MiB

Length

Max length 15
Median length 14
Mean length 8.4720151
Min length 4

Characters and Unicode

Total characters 2493026
Distinct characters 48
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Berlin
2nd row England
3rd row New South Wales
4th row Ontario
5th row England
Value Count Frequency (%)
england 61470 15.5%
new 52050 13.1%
berlin 51492 13.0%
south 46026 11.6%
wales 44231 11.2%
ontario 44155 11.2%
connecticut 21154 5.3%
maine 11966 3.0%
georgia 9289 2.3%
kansas 5378 1.4%
Other values (46) 48657 12.3%

Most occurring characters

Value Count Frequency (%)
n 305892 12.3%
a 228960 9.2%
e 213012 8.5%
i 174559 7.0%
l 169734 6.8%
o 150446 6.0%
t 143854 5.8%
r 122719 4.9%
(space) 101602 4.1%
s 79767 3.2%
Other values (38) 802481 32.2%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 1995556 80.0%
Uppercase Letter 395868 15.9%
Space Separator 101602 4.1%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
n 305892 15.3%
a 228960 11.5%
e 213012 10.7%
i 174559 8.7%
l 169734 8.5%
o 150446 7.5%
t 143854 7.2%
r 122719 6.1%
s 79767 4.0%
g 76065 3.8%
Other values (14) 330548 16.6%
Uppercase Letter
Value Count Frequency (%)
E 61470 15.5%
N 55594 14.0%
B 51492 13.0%
W 47741 12.1%
O 46786 11.8%
S 46026 11.6%
C 24740 6.2%
M 23341 5.9%
G 9289 2.3%
K 6252 1.6%
Other values (13) 23137 5.8%
Space Separator
Value Count Frequency (%)
(space) 101602 100.0%

Most occurring scripts

Value Count Frequency (%)
Latin 2391424 95.9%
Common 101602 4.1%

Most frequent character per script

Latin
Value Count Frequency (%)
n 305892 12.8%
a 228960 9.6%
e 213012 8.9%
i 174559 7.3%
l 169734 7.1%
o 150446 6.3%
t 143854 6.0%
r 122719 5.1%
s 79767 3.3%
g 76065 3.2%
Other values (37) 726416 30.4%
Common
Value Count Frequency (%)
(space) 101602 100.0%

Most occurring blocks

Value Count Frequency (%)
ASCII 2493026 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
n 305892 12.3%
a 228960 9.2%
e 213012 8.5%
i 174559 7.0%
l 169734 6.8%
o 150446 6.0%
t 143854 5.8%
r 122719 4.9%
(space) 101602 4.1%
s 79767 3.2%
Other values (38) 802481 32.2%

Zipcode
Real number (ℝ)

Distinct 93601
Distinct (%) 31.8%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 50286.171
Minimum 501
Maximum 99949
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 2.2 MiB

Quantile statistics

Minimum 501
5-th percentile 4794
Q1 25403
median 50585
Q3 75249
95-th percentile 95730.75
Maximum 99949
Range 99448
Interquartile range (IQR) 49846

Descriptive statistics

Standard deviation 28976.865
Coefficient of variation (CV) 0.57623923
Kurtosis -1.1955242
Mean 50286.171
Median Absolute Deviation (MAD) 24949
Skewness -0.010316388
Sum 1.4797511 × 10^10
Variance 8.3965869 × 10^8
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
68029 22 < 0.1%
2891 20 < 0.1%
2826 20 < 0.1%
68005 19 < 0.1%
68023 19 < 0.1%
2899 19 < 0.1%
68069 19 < 0.1%
68062 18 < 0.1%
68048 18 < 0.1%
68113 18 < 0.1%
Other values (93591) 294074 99.9%
Value Count Frequency (%)
501 2 < 0.1%
502 1 < 0.1%
503 1 < 0.1%
504 1 < 0.1%
505 2 < 0.1%
507 2 < 0.1%
509 3 < 0.1%
510 2 < 0.1%
511 2 < 0.1%
512 2 < 0.1%
Value Count Frequency (%)
99949 3 < 0.1%
99948 5 < 0.1%
99947 4 < 0.1%
99946 2 < 0.1%
99945 5 < 0.1%
99944 6 < 0.1%
99943 5 < 0.1%
99942 5 < 0.1%
99941 5 < 0.1%
99940 6 < 0.1%

Country
Categorical

Distinct 5
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 15.1 MiB
USA 92918
UK 61470
Germany 51492
Australia 44231
Canada 44155

Length

Max length 9
Median length 7
Mean length 4.843057
Min length 2

Characters and Unicode

Total characters 1425147
Distinct characters 18
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Germany
2nd row UK
3rd row Australia
4th row Canada
5th row UK

Common Values

Value Count Frequency (%)
USA 92918 31.6%
UK 61470 20.9%
Germany 51492 17.5%
Australia 44231 15.0%
Canada 44155 15.0%
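The Frequency (%) column is each count divided by the total number of observations. A minimal sketch that recomputes it for Country, assuming one-decimal rounding as displayed (the dictionary and variable names are illustrative):

```python
# Counts copied from the Country "Common Values" table.
counts = {
    "USA": 92_918,
    "UK": 61_470,
    "Germany": 51_492,
    "Australia": 44_231,
    "Canada": 44_155,
}

# With no missing cells, the counts should add up to the number of observations.
total = sum(counts.values())

# Relative frequency of each value, in percent, rounded to one decimal place.
frequency_pct = {value: round(100 * n / total, 1) for value, n in counts.items()}

for value, pct in frequency_pct.items():
    print(f"{value} {counts[value]} {pct}%")
```

The counts sum to 294266, matching the number of observations in the Overview.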

Length

Histogram of lengths of the category

Common Values (Plot)

Value Count Frequency (%)
usa 92918 31.6%
uk 61470 20.9%
germany 51492 17.5%
australia 44231 15.0%
canada 44155 15.0%

Most occurring characters

Value Count Frequency (%)
a 272419 19.1%
U 154388 10.8%
A 137149 9.6%
r 95723 6.7%
n 95647 6.7%
S 92918 6.5%
K 61470 4.3%
G 51492 3.6%
e 51492 3.6%
m 51492 3.6%
Other values (8) 360957 25.3%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 883575 62.0%
Uppercase Letter 541572 38.0%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
a 272419 30.8%
r 95723 10.8%
n 95647 10.8%
e 51492 5.8%
m 51492 5.8%
y 51492 5.8%
u 44231 5.0%
s 44231 5.0%
t 44231 5.0%
l 44231 5.0%
Other values (2) 88386 10.0%
Uppercase Letter
Value Count Frequency (%)
U 154388 28.5%
A 137149 25.3%
S 92918 17.2%
K 61470 11.4%
G 51492 9.5%
C 44155 8.2%

Most occurring scripts

Value Count Frequency (%)
Latin 1425147 100.0%

Most frequent character per script

Latin
Value Count Frequency (%)
a 272419 19.1%
U 154388 10.8%
A 137149 9.6%
r 95723 6.7%
n 95647 6.7%
S 92918 6.5%
K 61470 4.3%
G 51492 3.6%
e 51492 3.6%
m 51492 3.6%
Other values (8) 360957 25.3%

Most occurring blocks

Value Count Frequency (%)
ASCII 1425147 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
a 272419 19.1%
U 154388 10.8%
A 137149 9.6%
r 95723 6.7%
n 95647 6.7%
S 92918 6.5%
K 61470 4.3%
G 51492 3.6%
e 51492 3.6%
m 51492 3.6%
Other values (8) 360957 25.3%

Age
Real number (ℝ)

Distinct 53
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 35.464906
Minimum 18
Maximum 70
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 2.2 MiB

Quantile statistics

Minimum 18
5-th percentile 19
Q1 22
median 32
Q3 46
95-th percentile 65
Maximum 70
Range 52
Interquartile range (IQR) 24

Descriptive statistics

Standard deviation 15.017996
Coefficient of variation (CV) 0.42346077
Kurtosis -0.80549695
Mean 35.464906
Median Absolute Deviation (MAD) 12
Skewness 0.6543886
Sum 10436116
Variance 225.54022
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
20 33765 11.5%
46 29874 10.2%
26 24139 8.2%
22 22450 7.6%
34 20188 6.9%
23 17631 6.0%
19 16474 5.6%
21 7871 2.7%
24 6032 2.0%
55 5324 1.8%
Other values (43) 110518 37.6%
Value Count Frequency (%)
18 2427 0.8%
19 16474 5.6%
20 33765 11.5%
21 7871 2.7%
22 22450 7.6%
23 17631 6.0%
24 6032 2.0%
25 2510 0.9%
26 24139 8.2%
27 2509 0.9%
Value Count Frequency (%)
70 2662 0.9%
69 2508 0.9%
68 2582 0.9%
67 2478 0.8%
66 2426 0.8%
65 2556 0.9%
64 2576 0.9%
63 2499 0.8%
62 2547 0.9%
61 2520 0.9%

Gender
Categorical

Distinct 2
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 15.1 MiB
Male 182989
Female 111277

Length

Max length 6
Median length 4
Mean length 4.7563021
Min length 4

Characters and Unicode

Total characters 1399618
Distinct characters 6
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Male
2nd row Female
3rd row Male
4th row Male
5th row Male

Common Values

Value Count Frequency (%)
Male 182989 62.2%
Female 111277 37.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value Count Frequency (%)
male 182989 62.2%
female 111277 37.8%

Most occurring characters

Value Count Frequency (%)
e 405543 29.0%
a 294266 21.0%
l 294266 21.0%
M 182989 13.1%
F 111277 8.0%
m 111277 8.0%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 1105352 79.0%
Uppercase Letter 294266 21.0%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
e 405543 36.7%
a 294266 26.6%
l 294266 26.6%
m 111277 10.1%
Uppercase Letter
Value Count Frequency (%)
M 182989 62.2%
F 111277 37.8%

Most occurring scripts

Value Count Frequency (%)
Latin 1399618 100.0%

Most frequent character per script

Latin
Value Count Frequency (%)
e 405543 29.0%
a 294266 21.0%
l 294266 21.0%
M 182989 13.1%
F 111277 8.0%
m 111277 8.0%

Most occurring blocks

Value Count Frequency (%)
ASCII 1399618 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
e 405543 29.0%
a 294266 21.0%
l 294266 21.0%
M 182989 13.1%
F 111277 8.0%
m 111277 8.0%

Income
Categorical

Distinct 3
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 15.0 MiB
Medium 127046
Low 93795
High 73425

Length

Max length 6
Median length 4
Mean length 4.544735
Min length 3

Characters and Unicode

Total characters 1337361
Distinct characters 12
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Low
2nd row Low
3rd row Low
4th row High
5th row Low

Common Values

Value Count Frequency (%)
Medium 127046 43.2%
Low 93795 31.9%
High 73425 25.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value Count Frequency (%)
medium 127046 43.2%
low 93795 31.9%
high 73425 25.0%

Most occurring characters

Value Count Frequency (%)
i 200471 15.0%
M 127046 9.5%
e 127046 9.5%
d 127046 9.5%
u 127046 9.5%
m 127046 9.5%
L 93795 7.0%
o 93795 7.0%
w 93795 7.0%
H 73425 5.5%
Other values (2) 146850 11.0%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 1043095 78.0%
Uppercase Letter 294266 22.0%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
i 200471 19.2%
e 127046 12.2%
d 127046 12.2%
u 127046 12.2%
m 127046 12.2%
o 93795 9.0%
w 93795 9.0%
g 73425 7.0%
h 73425 7.0%
Uppercase Letter
Value Count Frequency (%)
M 127046 43.2%
L 93795 31.9%
H 73425 25.0%

Most occurring scripts

Value Count Frequency (%)
Latin 1337361 100.0%

Most frequent character per script

Latin
Value Count Frequency (%)
i 200471 15.0%
M 127046 9.5%
e 127046 9.5%
d 127046 9.5%
u 127046 9.5%
m 127046 9.5%
L 93795 7.0%
o 93795 7.0%
w 93795 7.0%
H 73425 5.5%
Other values (2) 146850 11.0%

Most occurring blocks

Value Count Frequency (%)
ASCII 1337361 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
i 200471 15.0%
M 127046 9.5%
e 127046 9.5%
d 127046 9.5%
u 127046 9.5%
m 127046 9.5%
L 93795 7.0%
o 93795 7.0%
w 93795 7.0%
H 73425 5.5%
Other values (2) 146850 11.0%

Customer_Segment
Categorical

Distinct 3
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 15.4 MiB
Regular 142721
New 88867
Premium 62678

Length

Max length 7
Median length 7
Mean length 5.7920181
Min length 3

Characters and Unicode

Total characters 1704394
Distinct characters 12
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Regular
2nd row Premium
3rd row Regular
4th row Premium
5th row Premium

Common Values

Value Count Frequency (%)
Regular 142721 48.5%
New 88867 30.2%
Premium 62678 21.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value Count Frequency (%)
regular 142721 48.5%
new 88867 30.2%
premium 62678 21.3%

Most occurring characters

Value Count Frequency (%)
e 294266 17.3%
u 205399 12.1%
r 205399 12.1%
R 142721 8.4%
g 142721 8.4%
l 142721 8.4%
a 142721 8.4%
m 125356 7.4%
N 88867 5.2%
w 88867 5.2%
Other values (2) 125356 7.4%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 1410128 82.7%
Uppercase Letter 294266 17.3%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
e 294266 20.9%
u 205399 14.6%
r 205399 14.6%
g 142721 10.1%
l 142721 10.1%
a 142721 10.1%
m 125356 8.9%
w 88867 6.3%
i 62678 4.4%
Uppercase Letter
Value Count Frequency (%)
R 142721 48.5%
N 88867 30.2%
P 62678 21.3%

Most occurring scripts

Value Count Frequency (%)
Latin 1704394 100.0%

Most frequent character per script

Latin
Value Count Frequency (%)
e 294266 17.3%
u 205399 12.1%
r 205399 12.1%
R 142721 8.4%
g 142721 8.4%
l 142721 8.4%
a 142721 8.4%
m 125356 7.4%
N 88867 5.2%
w 88867 5.2%
Other values (2) 125356 7.4%

Most occurring blocks

Value Count Frequency (%)
ASCII 1704394 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
e 294266 17.3%
u 205399 12.1%
r 205399 12.1%
R 142721 8.4%
g 142721 8.4%
l 142721 8.4%
a 142721 8.4%
m 125356 7.4%
N 88867 5.2%
w 88867 5.2%
Other values (2) 125356 7.4%

Date
Date

Distinct 366
Distinct (%) 0.1%
Missing 0
Missing (%) 0.0%
Memory size 2.2 MiB
Minimum 2023-03-01 00:00:00
Maximum 2024-02-29 00:00:00
Invalid dates 0
Invalid dates (%) 0.0%
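The distinct count of 366 equals the number of calendar days between the minimum and maximum, inclusive, which suggests every day in the span occurs at least once. A short sketch of that check, assuming all values carry a midnight timestamp as the minimum and maximum do (the variable names are illustrative):

```python
from datetime import date

# Minimum and maximum copied from the Date variable above (time part is 00:00:00).
first = date(2023, 3, 1)
last = date(2024, 2, 29)

# Calendar days in the span, counting both endpoints.
days_in_span = (last - first).days + 1

print(days_in_span)
```

The span includes 29 February 2024, which is why it covers 366 days rather than 365.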
Histogram with fixed size bins (bins=50)

Year
Categorical

High correlation 

Distinct 2
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 15.4 MiB
2023.0 245676
2024.0 48590

Length

Max length 6
Median length 6
Mean length 6
Min length 6

Characters and Unicode

Total characters 1765596
Distinct characters 5
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 2023.0
2nd row 2023.0
3rd row 2023.0
4th row 2023.0
5th row 2024.0

Common Values

Value Count Frequency (%)
2023.0 245676 83.5%
2024.0 48590 16.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value Count Frequency (%)
2023.0 245676 83.5%
2024.0 48590 16.5%

Most occurring characters

Value Count Frequency (%)
2 588532
33.3%
0 588532
33.3%
. 294266
16.7%
3 245676
13.9%
4 48590
 
2.8%

Most occurring categories

Value Count Frequency (%)
Decimal Number 1471330
83.3%
Other Punctuation 294266
 
16.7%

Most frequent character per category

Decimal Number
Value Count Frequency (%)
2 588532
40.0%
0 588532
40.0%
3 245676
16.7%
4 48590
 
3.3%
Other Punctuation
Value Count Frequency (%)
. 294266
100.0%

Most occurring scripts

Value Count Frequency (%)
Common 1765596
100.0%

Most frequent character per script

Common
Value Count Frequency (%)
2 588532
33.3%
0 588532
33.3%
. 294266
16.7%
3 245676
13.9%
4 48590
 
2.8%

Most occurring blocks

Value Count Frequency (%)
ASCII 1765596
100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
2 588532
33.3%
0 588532
33.3%
. 294266
16.7%
3 245676
13.9%
4 48590
 
2.8%

Month
Categorical

High correlation 

Distinct 12
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 15.4 MiB
April 40263
January 36460
August 32149
July 30101
May 27624
Other values (7) 127669

Length

Max length 9
Median length 7
Mean length 5.9327989
Min length 3

Characters and Unicode

Total characters 1745821
Distinct characters 26
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row September
2nd row December
3rd row April
4th row May
5th row January

Common Values

Value Count Frequency (%)
April 40263 13.7%
January 36460 12.4%
August 32149 10.9%
July 30101 10.2%
May 27624 9.4%
March 18678 6.3%
October 18655 6.3%
December 18477 6.3%
September 18189 6.2%
November 17943 6.1%
Other values (2) 35727 12.1%

Length

Histogram of lengths of the category
Value Count Frequency (%)
april 40263 13.7%
january 36460 12.4%
august 32149 10.9%
july 30101 10.2%
may 27624 9.4%
march 18678 6.3%
october 18655 6.3%
december 18477 6.3%
september 18189 6.2%
november 17943 6.1%
Other values (2) 35727 12.1%

Most occurring characters

Value Count Frequency (%)
r 204251
 
11.7%
e 200266
 
11.5%
u 166586
 
9.5%
a 137015
 
7.8%
y 111978
 
6.4%
b 91057
 
5.2%
J 84495
 
4.8%
A 72412
 
4.1%
l 70364
 
4.0%
t 68993
 
4.0%
Other values (16) 538404
30.8%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 1451555
83.1%
Uppercase Letter 294266
 
16.9%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
r 204251
14.1%
e 200266
13.8%
u 166586
11.5%
a 137015
9.4%
y 111978
 
7.7%
b 91057
 
6.3%
l 70364
 
4.8%
t 68993
 
4.8%
p 58452
 
4.0%
c 55810
 
3.8%
Other values (8) 286783
19.8%
Uppercase Letter
Value Count Frequency (%)
J 84495
28.7%
A 72412
24.6%
M 46302
15.7%
O 18655
 
6.3%
D 18477
 
6.3%
S 18189
 
6.2%
N 17943
 
6.1%
F 17793
 
6.0%

Most occurring scripts

Value Count Frequency (%)
Latin 1745821
100.0%

Most frequent character per script

Latin
Value Count Frequency (%)
r 204251
 
11.7%
e 200266
 
11.5%
u 166586
 
9.5%
a 137015
 
7.8%
y 111978
 
6.4%
b 91057
 
5.2%
J 84495
 
4.8%
A 72412
 
4.1%
l 70364
 
4.0%
t 68993
 
4.0%
Other values (16) 538404
30.8%

Most occurring blocks

Value Count Frequency (%)
ASCII 1745821
100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
r 204251
 
11.7%
e 200266
 
11.5%
u 166586
 
9.5%
a 137015
 
7.8%
y 111978
 
6.4%
b 91057
 
5.2%
J 84495
 
4.8%
A 72412
 
4.1%
l 70364
 
4.0%
t 68993
 
4.0%
Other values (16) 538404
30.8%

Time
Date

Distinct 83446
Distinct (%) 28.4%
Missing 0
Missing (%) 0.0%
Memory size 2.2 MiB
Minimum 2025-09-11 00:00:00
Maximum 2025-09-11 23:59:59
Invalid dates 0
Invalid dates (%) 0.0%
Histogram with fixed size bins (bins=50)
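
The Minimum and Maximum reported for Time share the date on which the analysis ran (2025-09-11) and span 00:00:00 to 23:59:59, which suggests the column holds a time of day that was given a placeholder date during parsing. The following is a minimal sketch, using only the Python standard library, of keeping just the time-of-day component. The input strings and the helper name `time_of_day` are illustrative assumptions, not part of the report.

```python
from datetime import datetime, time


def time_of_day(stamp: str) -> time:
    """Parse a 'YYYY-MM-DD HH:MM:SS' string and discard the date part."""
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S").time()


# Endpoints copied from the Minimum and Maximum rows for Time.
earliest = time_of_day("2025-09-11 00:00:00")
latest = time_of_day("2025-09-11 23:59:59")

# Number of one-second slots covered by the range, both endpoints included.
seconds_covered = (
    (latest.hour - earliest.hour) * 3600
    + (latest.minute - earliest.minute) * 60
    + (latest.second - earliest.second)
    + 1
)
print(earliest, latest, seconds_covered)
```

Assuming one-second resolution, the 83446 distinct values reported above would cover most of the possible seconds in a day.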

Total_Purchases
Real number (ℝ)

High correlation 

Distinct 10
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 5.3597493
Minimum 1
Maximum 10
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 2.2 MiB

Quantile statistics

Minimum 1
5-th percentile 1
Q1 3
median 5
Q3 8
95-th percentile 10
Maximum 10
Range 9
Interquartile range (IQR) 5

Descriptive statistics

Standard deviation 2.8684169
Coefficient of variation (CV) 0.53517744
Kurtosis -1.2164131
Mean 5.3597493
Median Absolute Deviation (MAD) 2
Skewness 0.072392289
Sum 1577192
Variance 8.2278158
Monotonicity Not monotonic
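
Several of the Total_Purchases figures follow from one another: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. The following is a short sketch that recomputes these quantities from the reported values. Because the report rounds its figures, the results agree only to within small tolerances. The variable names are illustrative.

```python
# Values copied from the Total_Purchases statistics; n comes from the
# dataset overview (Number of observations).
n = 294266
mean = 5.3597493
std = 2.8684169
q1, q3 = 3, 8
minimum, maximum = 1, 10

cv = std / mean                   # coefficient of variation
variance = std ** 2               # variance
total = mean * n                  # sum of all values
iqr = q3 - q1                     # interquartile range
value_range = maximum - minimum   # range

print(f"CV={cv:.8f} variance={variance:.7f} sum={total:.0f} "
      f"IQR={iqr} range={value_range}")
```

The same relationships can be used as a quick consistency check on the other numeric variables in this report.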
Histogram with fixed size bins (bins=10)
Value Count Frequency (%)
5 31152 10.6%
3 31119 10.6%
2 31116 10.6%
1 31094 10.6%
4 30813 10.5%
8 28000 9.5%
6 27811 9.5%
9 27789 9.4%
7 27730 9.4%
10 27642 9.4%
Value Count Frequency (%)
1 31094 10.6%
2 31116 10.6%
3 31119 10.6%
4 30813 10.5%
5 31152 10.6%
6 27811 9.5%
7 27730 9.4%
8 28000 9.5%
9 27789 9.4%
10 27642 9.4%
Value Count Frequency (%)
10 27642 9.4%
9 27789 9.4%
8 28000 9.5%
7 27730 9.4%
6 27811 9.5%
5 31152 10.6%
4 30813 10.5%
3 31119 10.6%
2 31116 10.6%
1 31094 10.6%

Amount
Real number (ℝ)

High correlation 

Distinct 291968
Distinct (%) 99.2%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 255.15457
Minimum 10.000219
Maximum 499.99791
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 2.2 MiB

Quantile statistics

Minimum 10.000219
5-th percentile 34.372815
Q1 132.84523
median 255.45992
Q3 377.63719
95-th percentile 475.59507
Maximum 499.99791
Range 489.99769
Interquartile range (IQR) 244.79195

Descriptive statistics

Standard deviation 141.39032
Coefficient of variation (CV) 0.55413596
Kurtosis -1.1985938
Mean 255.15457
Median Absolute Deviation (MAD) 122.4145
Skewness -0.0021230239
Sum 75083316
Variance 19991.224
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
429.1301809 2 < 0.1%
132.0189587 2 < 0.1%
211.2883556 2 < 0.1%
121.1225448 2 < 0.1%
130.459304 2 < 0.1%
186.1022091 2 < 0.1%
254.3104478 2 < 0.1%
157.2364982 2 < 0.1%
255.2514394 2 < 0.1%
457.7330586 2 < 0.1%
Other values (291958) 294246 > 99.9%
Value Count Frequency (%)
10.00021923 1 < 0.1%
10.00075279 1 < 0.1%
10.00374959 1 < 0.1%
10.00610943 1 < 0.1%
10.00676942 1 < 0.1%
10.00693081 1 < 0.1%
10.0089709 1 < 0.1%
10.01000385 1 < 0.1%
10.01133597 1 < 0.1%
10.01160326 1 < 0.1%
Value Count Frequency (%)
499.997911 1 < 0.1%
499.9970238 1 < 0.1%
499.9966805 1 < 0.1%
499.995511 1 < 0.1%
499.9938845 1 < 0.1%
499.9893306 1 < 0.1%
499.9890825 1 < 0.1%
499.9883822 1 < 0.1%
499.9763531 1 < 0.1%
499.9756795 1 < 0.1%

Total_Amount
Real number (ℝ)

High correlation 

Distinct 291971
Distinct (%) 99.2%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 1367.6375
Minimum 10.00375
Maximum 4999.6258
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 2.2 MiB

Quantile statistics

Minimum 10.00375
5-th percentile 114.80922
Q1 438.79323
median 1041.1694
Q3 2028.9568
95-th percentile 3689.8732
Maximum 4999.6258
Range 4989.622
Interquartile range (IQR) 1590.1636

Descriptive statistics

Standard deviation 1128.8728
Coefficient of variation (CV) 0.82541808
Kurtosis 0.17504498
Mean 1367.6375
Median Absolute Deviation (MAD) 701.6355
Skewness 0.97290504
Sum 4.0244923 × 10^8
Variance 1274353.7
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
812.8175565 2 < 0.1%
3893.424561 2 < 0.1%
4180.46868 2 < 0.1%
2115.765127 2 < 0.1%
264.798718 2 < 0.1%
1318.898194 2 < 0.1%
989.9091955 2 < 0.1%
292.5440492 2 < 0.1%
41.31672915 2 < 0.1%
1094.293599 2 < 0.1%
Other values (291961) 294246 > 99.9%
Value Count Frequency (%)
10.00374959 1 < 0.1%
10.01133597 1 < 0.1%
10.05635304 1 < 0.1%
10.06326938 1 < 0.1%
10.06815366 1 < 0.1%
10.09296593 1 < 0.1%
10.13350038 1 < 0.1%
10.13641713 1 < 0.1%
10.17207819 1 < 0.1%
10.19883003 1 < 0.1%
Value Count Frequency (%)
4999.625796 1 < 0.1%
4999.340097 1 < 0.1%
4999.171428 1 < 0.1%
4998.723479 1 < 0.1%
4998.603558 1 < 0.1%
4998.473906 1 < 0.1%
4998.306569 1 < 0.1%
4998.204389 1 < 0.1%
4997.986042 1 < 0.1%
4997.933548 1 < 0.1%

Product_Category
Categorical

High correlation 

Distinct 5
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 16.1 MiB
Electronics 69425
Grocery 65208
Clothing 53361
Books 53257
Home Decor 53015

Length

Max length 11
Median length 8
Mean length 8.3035553
Min length 5

Characters and Unicode

Total characters 2443454
Distinct characters 21
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Clothing
2nd row Electronics
3rd row Books
4th row Home Decor
5th row Grocery

Common Values

Value Count Frequency (%)
Electronics 69425 23.6%
Grocery 65208 22.2%
Clothing 53361 18.1%
Books 53257 18.1%
Home Decor 53015 18.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value Count Frequency (%)
electronics 69425 20.0%
grocery 65208 18.8%
clothing 53361 15.4%
books 53257 15.3%
home 53015 15.3%
decor 53015 15.3%

Most occurring characters

Value Count Frequency (%)
o 400538
16.4%
c 257073
10.5%
r 252856
10.3%
e 240663
 
9.8%
l 122786
 
5.0%
n 122786
 
5.0%
t 122786
 
5.0%
i 122786
 
5.0%
s 122682
 
5.0%
E 69425
 
2.8%
Other values (11) 609073
24.9%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 2043158
83.6%
Uppercase Letter 347281
 
14.2%
Space Separator 53015
 
2.2%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
o 400538
19.6%
c 257073
12.6%
r 252856
12.4%
e 240663
11.8%
l 122786
 
6.0%
n 122786
 
6.0%
t 122786
 
6.0%
i 122786
 
6.0%
s 122682
 
6.0%
y 65208
 
3.2%
Other values (4) 212994
10.4%
Uppercase Letter
Value Count Frequency (%)
E 69425
20.0%
G 65208
18.8%
C 53361
15.4%
B 53257
15.3%
H 53015
15.3%
D 53015
15.3%
Space Separator
Value Count Frequency (%)
53015
100.0%

Most occurring scripts

Value Count Frequency (%)
Latin 2390439
97.8%
Common 53015
 
2.2%

Most frequent character per script

Latin
Value Count Frequency (%)
o 400538
16.8%
c 257073
10.8%
r 252856
10.6%
e 240663
10.1%
l 122786
 
5.1%
n 122786
 
5.1%
t 122786
 
5.1%
i 122786
 
5.1%
s 122682
 
5.1%
E 69425
 
2.9%
Other values (10) 556058
23.3%
Common
Value Count Frequency (%)
53015
100.0%

Most occurring blocks

Value Count Frequency (%)
ASCII 2443454
100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
o 400538
16.4%
c 257073
10.5%
r 252856
10.3%
e 240663
 
9.8%
l 122786
 
5.0%
n 122786
 
5.0%
t 122786
 
5.0%
i 122786
 
5.0%
s 122682
 
5.0%
E 69425
 
2.8%
Other values (11) 609073
24.9%

Product_Brand
Categorical

High correlation 

Distinct 18
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 16.0 MiB
Pepsi 29577
Coca-Cola 17965
HarperCollins 17918
Zara 17901
Samsung 17879
Other values (13) 193026

Length

Max length 17
Median length 12
Mean length 7.8918462
Min length 4

Characters and Unicode

Total characters 2322302
Distinct characters 37
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Nike
2nd row Samsung
3rd row Penguin Books
4th row Home Depot
5th row Nestle

Common Values

Value Count Frequency (%)
Pepsi 29577 10.1%
Coca-Cola 17965 6.1%
HarperCollins 17918 6.1%
Zara 17901 6.1%
Samsung 17879 6.1%
Sony 17864 6.1%
Adidas 17791 6.0%
Bed Bath & Beyond 17775 6.0%
Home Depot 17698 6.0%
Random House 17696 6.0%
Other values (8) 104202 35.4%

Length

Histogram of lengths of the category
Value Count Frequency (%)
pepsi 29577
 
7.4%
coca-cola 17965
 
4.5%
harpercollins 17918
 
4.5%
zara 17901
 
4.5%
samsung 17879
 
4.5%
sony 17864
 
4.5%
adidas 17791
 
4.4%
bed 17775
 
4.4%
bath 17775
 
4.4%
17775
 
4.4%
Other values (14) 210408
52.5%

Most occurring characters

Value Count Frequency (%)
e 233878
 
10.1%
o 210145
 
9.0%
a 163001
 
7.0%
s 149340
 
6.4%
i 127645
 
5.5%
n 124418
 
5.4%
p 107675
 
4.6%
106362
 
4.6%
l 98564
 
4.2%
d 88828
 
3.8%
Other values (27) 912446
39.3%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 1706628
73.5%
Uppercase Letter 473572
 
20.4%
Space Separator 106362
 
4.6%
Dash Punctuation 17965
 
0.8%
Other Punctuation 17775
 
0.8%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
e 233878
13.7%
o 210145
12.3%
a 163001
9.6%
s 149340
8.8%
i 127645
 
7.5%
n 124418
 
7.3%
p 107675
 
6.3%
l 98564
 
5.8%
d 88828
 
5.2%
r 63239
 
3.7%
Other values (9) 339895
19.9%
Uppercase Letter
Value Count Frequency (%)
B 73178
15.5%
C 53848
11.4%
H 53312
11.3%
A 52928
11.2%
P 47220
10.0%
S 37953
8.0%
N 35335
7.5%
Z 17901
 
3.8%
D 17698
 
3.7%
R 17696
 
3.7%
Other values (5) 66503
14.0%
Space Separator
Value Count Frequency (%)
106362
100.0%
Dash Punctuation
Value Count Frequency (%)
- 17965
100.0%
Other Punctuation
Value Count Frequency (%)
& 17775
100.0%

Most occurring scripts

Value Count Frequency (%)
Latin 2180200
93.9%
Common 142102
 
6.1%

Most frequent character per script

Latin
Value Count Frequency (%)
e 233878
 
10.7%
o 210145
 
9.6%
a 163001
 
7.5%
s 149340
 
6.8%
i 127645
 
5.9%
n 124418
 
5.7%
p 107675
 
4.9%
l 98564
 
4.5%
d 88828
 
4.1%
B 73178
 
3.4%
Other values (24) 803528
36.9%
Common
Value Count Frequency (%)
106362
74.8%
- 17965
 
12.6%
& 17775
 
12.5%

Most occurring blocks

Value Count Frequency (%)
ASCII 2322302
100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
e 233878
 
10.1%
o 210145
 
9.0%
a 163001
 
7.0%
s 149340
 
6.4%
i 127645
 
5.5%
n 124418
 
5.4%
p 107675
 
4.6%
106362
 
4.6%
l 98564
 
4.2%
d 88828
 
3.8%
Other values (27) 912446
39.3%

Product_Type
Categorical

High correlation 

Distinct 33
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 16.0 MiB
Water 23872
Smartphone 17951
Non-Fiction 17697
Fiction 17561
Juice 11932
Other values (28) 205253

Length

Max length 34
Median length 10
Mean length 8.1726125
Min length 5

Characters and Unicode

Total characters 2404922
Distinct characters 41
Distinct categories 6
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Shorts
2nd row Tablet
3rd row Children's
4th row Tools
5th row Chocolate

Common Values

Value Count Frequency (%)
Water 23872 8.1%
Smartphone 17951 6.1%
Non-Fiction 17697 6.0%
Fiction 17561 6.0%
Juice 11932 4.1%
T-shirt 11887 4.0%
Television 11880 4.0%
Decorations 11855 4.0%
Shoes 11829 4.0%
Tablet 11764 4.0%
Other values (23) 146038 49.6%

Length

Histogram of lengths of the category
Value Count Frequency (%)
water 23872 6.9%
smartphone 17951 5.2%
non-fiction 17697 5.1%
fiction 17561 5.1%
juice 11932 3.4%
t-shirt 11887 3.4%
television 11880 3.4%
decorations 11855 3.4%
shoes 11829 3.4%
tablet 11764 3.4%
Other values (30) 199496 57.4%

Most occurring characters

Value Count Frequency (%)
i 234709
 
9.8%
e 217540
 
9.0%
t 217067
 
9.0%
o 195453
 
8.1%
r 170473
 
7.1%
n 166078
 
6.9%
a 121211
 
5.0%
s 107844
 
4.5%
h 101878
 
4.2%
c 82585
 
3.4%
Other values (31) 790084
32.9%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 1926310
80.1%
Uppercase Letter 363256
 
15.1%
Space Separator 53458
 
2.2%
Dash Punctuation 29584
 
1.2%
Decimal Number 19755
 
0.8%
Other Punctuation 12559
 
0.5%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
i 234709
12.2%
e 217540
11.3%
t 217067
11.3%
o 195453
10.1%
r 170473
8.8%
n 166078
8.6%
a 121211
 
6.3%
s 107844
 
5.6%
h 101878
 
5.3%
c 82585
 
4.3%
Other values (10) 311472
16.2%
Uppercase Letter
Value Count Frequency (%)
S 74499
20.5%
F 54234
14.9%
T 54020
14.9%
D 29556
 
8.1%
C 26706
 
7.4%
W 23872
 
6.6%
J 23742
 
6.5%
N 17697
 
4.9%
L 17644
 
4.9%
B 13935
 
3.8%
Other values (4) 27351
 
7.5%
Decimal Number
Value Count Frequency (%)
5 6585
33.3%
1 6585
33.3%
3 6585
33.3%
Other Punctuation
Value Count Frequency (%)
. 6585
52.4%
' 5974
47.6%
Space Separator
Value Count Frequency (%)
53458
100.0%
Dash Punctuation
Value Count Frequency (%)
- 29584
100.0%

Most occurring scripts

Value Count Frequency (%)
Latin 2289566
95.2%
Common 115356
 
4.8%

Most frequent character per script

Latin
Value Count Frequency (%)
i 234709
 
10.3%
e 217540
 
9.5%
t 217067
 
9.5%
o 195453
 
8.5%
r 170473
 
7.4%
n 166078
 
7.3%
a 121211
 
5.3%
s 107844
 
4.7%
h 101878
 
4.4%
c 82585
 
3.6%
Other values (24) 674728
29.5%
Common
Value Count Frequency (%)
53458
46.3%
- 29584
25.6%
5 6585
 
5.7%
. 6585
 
5.7%
1 6585
 
5.7%
3 6585
 
5.7%
' 5974
 
5.2%

Most occurring blocks

Value Count Frequency (%)
ASCII 2404922
100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
i 234709
 
9.8%
e 217540
 
9.0%
t 217067
 
9.0%
o 195453
 
8.1%
r 170473
 
7.1%
n 166078
 
6.9%
a 121211
 
5.0%
s 107844
 
4.5%
h 101878
 
4.2%
c 82585
 
3.4%
Other values (31) 790084
32.9%

Feedback
Categorical

High correlation 

Distinct 4
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 15.5 MiB
Excellent 98150
Good 92796
Average 61094
Bad 42226

Length

Max length 9
Median length 7
Mean length 6.1470574
Min length 3

Characters and Unicode

Total characters 1808870
Distinct characters 16
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Excellent
2nd row Excellent
3rd row Average
4th row Excellent
5th row Bad

Common Values

Value Count Frequency (%)
Excellent 98150 33.4%
Good 92796 31.5%
Average 61094 20.8%
Bad 42226 14.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value Count Frequency (%)
excellent 98150 33.4%
good 92796 31.5%
average 61094 20.8%
bad 42226 14.3%

Most occurring characters

Value Count Frequency (%)
e 318488
17.6%
l 196300
10.9%
o 185592
10.3%
d 135022
 
7.5%
a 103320
 
5.7%
c 98150
 
5.4%
x 98150
 
5.4%
E 98150
 
5.4%
n 98150
 
5.4%
t 98150
 
5.4%
Other values (6) 379398
21.0%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 1514604
83.7%
Uppercase Letter 294266
 
16.3%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
e 318488
21.0%
l 196300
13.0%
o 185592
12.3%
d 135022
8.9%
a 103320
 
6.8%
c 98150
 
6.5%
x 98150
 
6.5%
n 98150
 
6.5%
t 98150
 
6.5%
v 61094
 
4.0%
Other values (2) 122188
 
8.1%
Uppercase Letter
Value Count Frequency (%)
E 98150
33.4%
G 92796
31.5%
A 61094
20.8%
B 42226
14.3%

Most occurring scripts

Value Count Frequency (%)
Latin 1808870
100.0%

Most frequent character per script

Latin
Value Count Frequency (%)
e 318488
17.6%
l 196300
10.9%
o 185592
10.3%
d 135022
 
7.5%
a 103320
 
5.7%
c 98150
 
5.4%
x 98150
 
5.4%
E 98150
 
5.4%
n 98150
 
5.4%
t 98150
 
5.4%
Other values (6) 379398
21.0%

Most occurring blocks

Value Count Frequency (%)
ASCII 1808870
100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
e 318488
17.6%
l 196300
10.9%
o 185592
10.3%
d 135022
 
7.5%
a 103320
 
5.7%
c 98150
 
5.4%
x 98150
 
5.4%
E 98150
 
5.4%
n 98150
 
5.4%
t 98150
 
5.4%
Other values (6) 379398
21.0%

Shipping_Method
Categorical

Distinct 3
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 15.9 MiB
Same-Day 101653
Express 99722
Standard 92891

Length

Max length 8
Median length 8
Mean length 7.6611161
Min length 7

Characters and Unicode

Total characters 2254406
Distinct characters 15
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Same-Day
2nd row Standard
3rd row Same-Day
4th row Standard
5th row Standard

Common Values

Value Count Frequency (%)
Same-Day 101653 34.5%
Express 99722 33.9%
Standard 92891 31.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value Count Frequency (%)
same-day 101653 34.5%
express 99722 33.9%
standard 92891 31.6%

Most occurring characters

Value Count Frequency (%)
a 389088
17.3%
e 201375
8.9%
s 199444
8.8%
S 194544
8.6%
r 192613
 
8.5%
d 185782
 
8.2%
m 101653
 
4.5%
y 101653
 
4.5%
D 101653
 
4.5%
- 101653
 
4.5%
Other values (5) 484948
21.5%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 1756834
77.9%
Uppercase Letter 395919
 
17.6%
Dash Punctuation 101653
 
4.5%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
a 389088
22.1%
e 201375
11.5%
s 199444
11.4%
r 192613
11.0%
d 185782
10.6%
m 101653
 
5.8%
y 101653
 
5.8%
p 99722
 
5.7%
x 99722
 
5.7%
t 92891
 
5.3%
Uppercase Letter
Value Count Frequency (%)
S 194544
49.1%
D 101653
25.7%
E 99722
25.2%
Dash Punctuation
Value Count Frequency (%)
- 101653
100.0%

Most occurring scripts

Value Count Frequency (%)
Latin 2152753
95.5%
Common 101653
 
4.5%

Most frequent character per script

Latin
Value Count Frequency (%)
a 389088
18.1%
e 201375
9.4%
s 199444
9.3%
S 194544
9.0%
r 192613
8.9%
d 185782
8.6%
m 101653
 
4.7%
y 101653
 
4.7%
D 101653
 
4.7%
E 99722
 
4.6%
Other values (4) 385226
17.9%
Common
Value Count Frequency (%)
- 101653
100.0%

Most occurring blocks

Value Count Frequency (%)
ASCII 2254406
100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
a 389088
17.3%
e 201375
8.9%
s 199444
8.8%
S 194544
8.6%
r 192613
 
8.5%
d 185782
 
8.2%
m 101653
 
4.5%
y 101653
 
4.5%
D 101653
 
4.5%
- 101653
 
4.5%
Other values (5) 484948
21.5%

Payment_Method
Categorical

Distinct 4
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 16.0 MiB
Credit Card 87889
Debit Card 74847
Cash 72000
PayPal 59530

Length

Max length 11
Median length 10
Mean length 8.0214126
Min length 4

Characters and Unicode

Total characters 2360429
Distinct characters 15
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Debit Card
2nd row Credit Card
3rd row Credit Card
4th row PayPal
5th row Cash

Common Values

Value Count Frequency (%)
Credit Card 87889 29.9%
Debit Card 74847 25.4%
Cash 72000 24.5%
PayPal 59530 20.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value Count Frequency (%)
card 162736 35.6%
credit 87889 19.2%
debit 74847 16.4%
cash 72000 15.8%
paypal 59530 13.0%

Most occurring characters

Value Count Frequency (%)
a 353796
15.0%
C 322625
13.7%
r 250625
10.6%
d 250625
10.6%
e 162736
6.9%
t 162736
6.9%
i 162736
6.9%
162736
6.9%
P 119060
 
5.0%
b 74847
 
3.2%
Other values (5) 337907
14.3%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 1681161
71.2%
Uppercase Letter 516532
 
21.9%
Space Separator 162736
 
6.9%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
a 353796
21.0%
r 250625
14.9%
d 250625
14.9%
e 162736
9.7%
t 162736
9.7%
i 162736
9.7%
b 74847
 
4.5%
s 72000
 
4.3%
h 72000
 
4.3%
y 59530
 
3.5%
Uppercase Letter
Value Count Frequency (%)
C 322625
62.5%
P 119060
 
23.0%
D 74847
 
14.5%
Space Separator
Value Count Frequency (%)
162736
100.0%

Most occurring scripts

Value Count Frequency (%)
Latin 2197693
93.1%
Common 162736
 
6.9%

Most frequent character per script

Latin
Value Count Frequency (%)
a 353796
16.1%
C 322625
14.7%
r 250625
11.4%
d 250625
11.4%
e 162736
7.4%
t 162736
7.4%
i 162736
7.4%
P 119060
 
5.4%
b 74847
 
3.4%
D 74847
 
3.4%
Other values (4) 263060
12.0%
Common
Value Count Frequency (%)
162736
100.0%

Most occurring blocks

Value Count Frequency (%)
ASCII 2360429
100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
a 353796
15.0%
C 322625
13.7%
r 250625
10.6%
d 250625
10.6%
e 162736
6.9%
t 162736
6.9%
i 162736
6.9%
162736
6.9%
P 119060
 
5.0%
b 74847
 
3.2%
Other values (5) 337907
14.3%

Order_Status
Categorical

Distinct 4
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 16.1 MiB
Delivered 127393
Shipped 63348
Processing 55720
Pending 47805

Length

Max length 10
Median length 9
Mean length 8.4338931
Min length 7

Characters and Unicode

Total characters 2481808
Distinct characters 16
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Shipped
2nd row Processing
3rd row Processing
4th row Processing
5th row Shipped

Common Values

Value Count Frequency (%)
Delivered 127393 43.3%
Shipped 63348 21.5%
Processing 55720 18.9%
Pending 47805 16.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value Count Frequency (%)
delivered 127393 43.3%
shipped 63348 21.5%
processing 55720 18.9%
pending 47805 16.2%

Most occurring characters

Value Count Frequency (%)
e 549052
22.1%
i 294266
11.9%
d 238546
9.6%
r 183113
 
7.4%
n 151330
 
6.1%
l 127393
 
5.1%
v 127393
 
5.1%
D 127393
 
5.1%
p 126696
 
5.1%
s 111440
 
4.5%
Other values (6) 445186
17.9%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 2187542
88.1%
Uppercase Letter 294266
 
11.9%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
e 549052
25.1%
i 294266
13.5%
d 238546
10.9%
r 183113
 
8.4%
n 151330
 
6.9%
l 127393
 
5.8%
v 127393
 
5.8%
p 126696
 
5.8%
s 111440
 
5.1%
g 103525
 
4.7%
Other values (3) 174788
 
8.0%
Uppercase Letter
Value Count Frequency (%)
D 127393
43.3%
P 103525
35.2%
S 63348
21.5%

Most occurring scripts

Value Count Frequency (%)
Latin 2481808
100.0%

Most frequent character per script

Latin
Value Count Frequency (%)
e 549052
22.1%
i 294266
11.9%
d 238546
9.6%
r 183113
 
7.4%
n 151330
 
6.1%
l 127393
 
5.1%
v 127393
 
5.1%
D 127393
 
5.1%
p 126696
 
5.1%
s 111440
 
4.5%
Other values (6) 445186
17.9%

Most occurring blocks

Value Count Frequency (%)
ASCII 2481808
100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
e 549052
22.1%
i 294266
11.9%
d 238546
9.6%
r 183113
 
7.4%
n 151330
 
6.1%
l 127393
 
5.1%
v 127393
 
5.1%
D 127393
 
5.1%
p 126696
 
5.1%
s 111440
 
4.5%
Other values (6) 445186
17.9%

Ratings
Categorical

High correlation 

Distinct 5
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 14.6 MiB
4.0 95594
2.0 61094
5.0 48868
3.0 46484
1.0 42226

Length

Max length 3
Median length 3
Mean length 3
Min length 3

Characters and Unicode

Total characters 882798
Distinct characters 7
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 5.0
2nd row 4.0
3rd row 2.0
4th row 4.0
5th row 1.0

Common Values

Value Count Frequency (%)
4.0 95594 32.5%
2.0 61094 20.8%
5.0 48868 16.6%
3.0 46484 15.8%
1.0 42226 14.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value Count Frequency (%)
4.0 95594 32.5%
2.0 61094 20.8%
5.0 48868 16.6%
3.0 46484 15.8%
1.0 42226 14.3%

Most occurring characters

Value Count Frequency (%)
. 294266
33.3%
0 294266
33.3%
4 95594
 
10.8%
2 61094
 
6.9%
5 48868
 
5.5%
3 46484
 
5.3%
1 42226
 
4.8%

Most occurring categories

Value Count Frequency (%)
Decimal Number 588532
66.7%
Other Punctuation 294266
33.3%

Most frequent character per category

Decimal Number
Value Count Frequency (%)
0 294266
50.0%
4 95594
 
16.2%
2 61094
 
10.4%
5 48868
 
8.3%
3 46484
 
7.9%
1 42226
 
7.2%
Other Punctuation
Value Count Frequency (%)
. 294266
100.0%

Most occurring scripts

Value Count Frequency (%)
Common 882798
100.0%

Most frequent character per script

Common
Value Count Frequency (%)
. 294266
33.3%
0 294266
33.3%
4 95594
 
10.8%
2 61094
 
6.9%
5 48868
 
5.5%
3 46484
 
5.3%
1 42226
 
4.8%

Most occurring blocks

Value Count Frequency (%)
ASCII 882798
100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
. 294266
33.3%
0 294266
33.3%
4 95594
 
10.8%
2 61094
 
6.9%
5 48868
 
5.5%
3 46484
 
5.3%
1 42226
 
4.8%

products
Text

Distinct 318
Distinct (%) 0.1%
Missing 0
Missing (%) 0.0%
Memory size 16.9 MiB

Length

Max length 28
Median length 22
Mean length 11.156522
Min length 3

Characters and Unicode

Total characters 3282985
Distinct characters 54
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Cycling shorts
2nd row Lenovo Tab
3rd row Sports equipment
4th row Utility knife
5th row Chocolate cookies
Value Count Frequency (%)
water 23872 4.7%
tv 13001 2.6%
juice 11932 2.4%
tee 10751 2.1%
ac 8150 1.6%
refrigerator 7188 1.4%
jeans 5975 1.2%
dress 5952 1.2%
headphones 5921 1.2%
fiction 5806 1.2%
Other values (371) 405068 80.4%

Most occurring characters

Value Count Frequency (%)
e 366048
 
11.1%
r 249555
 
7.6%
a 232474
 
7.1%
o 212932
 
6.5%
209350
 
6.4%
t 202877
 
6.2%
i 200979
 
6.1%
s 172132
 
5.2%
n 148906
 
4.5%
l 148039
 
4.5%
Other values (44) 1139693
34.7%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 2655099
80.9%
Uppercase Letter 395208
 
12.0%
Space Separator 209350
 
6.4%
Dash Punctuation 22160
 
0.7%
Decimal Number 1168
 
< 0.1%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
e 366048 13.8%
r 249555 9.4%
a 232474 8.8%
o 212932 8.0%
t 202877 7.6%
i 200979 7.6%
s 172132 6.5%
n 148906 5.6%
l 148039 5.6%
c 96798 3.6%
Other values (16) 624359 23.5%

Uppercase Letter
Value Count Frequency (%)
S 44404 11.2%
C 43907 11.1%
T 34260 8.7%
P 33087 8.4%
A 27223 6.9%
M 25645 6.5%
D 23851 6.0%
B 23294 5.9%
G 18969 4.8%
L 18206 4.6%
Other values (15) 102362 25.9%

Space Separator
Value Count Frequency (%)
(space) 209350 100.0%

Dash Punctuation
Value Count Frequency (%)
- 22160 100.0%

Decimal Number
Value Count Frequency (%)
4 1168 100.0%

Most occurring scripts

Value Count Frequency (%)
Latin 3050307 92.9%
Common 232678 7.1%

Most frequent character per script

Latin
Value Count Frequency (%)
e 366048 12.0%
r 249555 8.2%
a 232474 7.6%
o 212932 7.0%
t 202877 6.7%
i 200979 6.6%
s 172132 5.6%
n 148906 4.9%
l 148039 4.9%
c 96798 3.2%
Other values (41) 1019567 33.4%

Common
Value Count Frequency (%)
(space) 209350 90.0%
- 22160 9.5%
4 1168 0.5%

Most occurring blocks

Value Count Frequency (%)
ASCII 3282985 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
e 366048 11.1%
r 249555 7.6%
a 232474 7.1%
o 212932 6.5%
(space) 209350 6.4%
t 202877 6.2%
i 200979 6.1%
s 172132 5.2%
n 148906 4.5%
l 148039 4.5%
Other values (44) 1139693 34.7%

Interactions


Correlations

Age Amount Country Customer_ID Customer_Segment Feedback Gender Income Month Order_Status Payment_Method Product_Brand Product_Category Product_Type Ratings Shipping_Method Total_Amount Total_Purchases Transaction_ID Year Zipcode
Age 1.000 0.001 0.149 0.000 0.221 0.221 0.112 0.140 0.157 0.190 0.056 0.182 0.121 0.190 0.158 0.019 0.028 0.043 -0.001 0.000 -0.000
Amount 0.001 1.000 0.000 0.000 0.000 0.003 0.000 0.002 0.000 0.000 0.000 0.000 0.000 0.000 0.003 0.006 0.700 0.000 -0.003 0.004 0.001
Country 0.149 0.000 1.000 0.002 0.082 0.054 0.019 0.166 0.076 0.040 0.021 0.071 0.044 0.077 0.039 0.007 0.000 0.003 0.000 0.000 0.033
Customer_ID 0.000 0.000 0.002 1.000 0.002 0.000 0.000 0.001 0.000 0.004 0.000 0.000 0.000 0.002 0.000 0.000 0.003 0.004 0.005 0.000 -0.002
Customer_Segment 0.221 0.000 0.082 0.002 1.000 0.078 0.029 0.119 0.137 0.125 0.033 0.102 0.025 0.103 0.075 0.008 0.000 0.004 0.004 0.000 0.000
Feedback 0.221 0.003 0.054 0.000 0.078 1.000 0.038 0.049 0.128 0.057 0.064 0.196 0.077 0.203 0.913 0.018 0.002 0.001 0.000 0.000 0.002
Gender 0.112 0.000 0.019 0.000 0.029 0.038 1.000 0.025 0.103 0.051 0.007 0.124 0.036 0.124 0.027 0.002 0.000 0.000 0.003 0.000 0.001
Income 0.140 0.002 0.166 0.001 0.119 0.049 0.025 1.000 0.131 0.047 0.018 0.033 0.012 0.035 0.048 0.005 0.002 0.000 0.004 0.000 0.000
Month 0.157 0.000 0.076 0.000 0.137 0.128 0.103 0.131 1.000 0.114 0.078 0.182 0.088 0.182 0.091 0.040 0.000 0.000 0.002 0.743 0.003
Order_Status 0.190 0.000 0.040 0.004 0.125 0.057 0.051 0.047 0.114 1.000 0.018 0.158 0.075 0.159 0.047 0.014 0.000 0.002 0.003 0.002 0.000
Payment_Method 0.056 0.000 0.021 0.000 0.033 0.064 0.007 0.018 0.078 0.018 1.000 0.108 0.029 0.108 0.055 0.024 0.000 0.003 0.000 0.001 0.002
Product_Brand 0.182 0.000 0.071 0.000 0.102 0.196 0.124 0.033 0.182 0.158 0.108 1.000 1.000 0.842 0.139 0.093 0.002 0.000 0.000 0.004 0.001
Product_Category 0.121 0.000 0.044 0.000 0.025 0.077 0.036 0.012 0.088 0.075 0.029 1.000 1.000 1.000 0.056 0.029 0.003 0.000 0.001 0.003 0.000
Product_Type 0.190 0.000 0.077 0.002 0.103 0.203 0.124 0.035 0.182 0.159 0.108 0.842 1.000 1.000 0.144 0.093 0.000 0.001 0.002 0.006 0.000
Ratings 0.158 0.003 0.039 0.000 0.075 0.913 0.027 0.048 0.091 0.047 0.055 0.139 0.056 0.144 1.000 0.016 0.003 0.000 0.000 0.002 0.000
Shipping_Method 0.019 0.006 0.007 0.000 0.008 0.018 0.002 0.005 0.040 0.014 0.024 0.093 0.029 0.093 0.016 1.000 0.000 0.002 0.004 0.000 0.000
Total_Amount 0.028 0.700 0.000 0.003 0.000 0.002 0.000 0.002 0.000 0.000 0.000 0.002 0.003 0.000 0.003 0.000 1.000 0.651 -0.003 0.003 0.001
Total_Purchases 0.043 0.000 0.003 0.004 0.004 0.001 0.000 0.000 0.000 0.002 0.003 0.000 0.000 0.001 0.000 0.002 0.651 1.000 -0.002 0.002 0.001
Transaction_ID -0.001 -0.003 0.000 0.005 0.004 0.000 0.003 0.004 0.002 0.003 0.000 0.000 0.001 0.002 0.000 0.004 -0.003 -0.002 1.000 0.000 -0.003
Year 0.000 0.004 0.000 0.000 0.000 0.000 0.000 0.000 0.743 0.002 0.001 0.004 0.003 0.006 0.002 0.000 0.003 0.002 0.000 1.000 0.005
Zipcode -0.000 0.001 0.033 -0.002 0.000 0.002 0.001 0.000 0.003 0.000 0.002 0.001 0.000 0.000 0.000 0.000 0.001 0.001 -0.003 0.005 1.000
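To show how the "High correlation" alerts relate to this matrix, the sketch below filters a handful of coefficients copied from the rows above against a cutoff. The function name `high_correlations` and the 0.5 threshold are assumptions made for illustration. The report does not state the cutoff it used, only that pairs as low as 0.651 were flagged.

```python
# A few coefficients copied from the matrix above (one entry per variable pair).
pairs = {
    ("Amount", "Total_Amount"): 0.700,
    ("Feedback", "Ratings"): 0.913,
    ("Month", "Year"): 0.743,
    ("Product_Brand", "Product_Type"): 0.842,
    ("Total_Amount", "Total_Purchases"): 0.651,
    ("Age", "Customer_Segment"): 0.221,
    ("Country", "Income"): 0.166,
}

def high_correlations(pairs, threshold=0.5):
    """Return (a, b, r) for pairs with |r| >= threshold, strongest first.

    The default threshold of 0.5 is an assumed value, not taken from the report.
    """
    flagged = [(a, b, r) for (a, b), r in pairs.items() if abs(r) >= threshold]
    return sorted(flagged, key=lambda item: -abs(item[2]))

flagged = high_correlations(pairs)
for a, b, r in flagged:
    print(f"{a} ~ {b}: {r:.3f}")
```

With this subset, the five pairs that also appear in the Alerts list pass the cutoff, while Age/Customer_Segment and Country/Income do not.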

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
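The per-column nullity count behind these plots can be approximated with a few lines of plain Python. This is a minimal sketch: the helper names `is_missing` and `nullity_by_column` are hypothetical, and the third row is invented to show a non-zero count, since the profiled dataset itself reports 0 missing cells.

```python
import math

def is_missing(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def nullity_by_column(rows, columns):
    """Return {column: number of missing cells} for a list of dict rows."""
    return {col: sum(is_missing(row.get(col)) for row in rows) for col in columns}

columns = ["Transaction_ID", "Country", "Ratings"]
rows = [
    {"Transaction_ID": 8691788.0, "Country": "Germany", "Ratings": 5.0},
    {"Transaction_ID": 2174773.0, "Country": "UK", "Ratings": 4.0},
    # Hypothetical row with gaps, added only to exercise the missing-value path.
    {"Transaction_ID": 1.0, "Country": None, "Ratings": float("nan")},
]
missing = nullity_by_column(rows, columns)
print(missing)
```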

Sample

Transaction_ID Customer_ID Name Email Address City State Zipcode Country Age Gender Income Customer_Segment Date Year Month Time Total_Purchases Amount Total_Amount Product_Category Product_Brand Product_Type Feedback Shipping_Method Payment_Method Order_Status Ratings products
0 8691788.0 37249.0 Michelle Harrington Ebony39@gmail.com 3959 Amanda Burgs Dortmund Berlin 77985.0 Germany 21.0 Male Low Regular 9/18/2023 2023.0 September 22:03:55 3.0 108.028757 324.086270 Clothing Nike Shorts Excellent Same-Day Debit Card Shipped 5.0 Cycling shorts
1 2174773.0 69749.0 Kelsey Hill Mark36@gmail.com 82072 Dawn Centers Nottingham England 99071.0 UK 19.0 Female Low Premium 12/31/2023 2023.0 December 08:42:04 2.0 403.353907 806.707815 Electronics Samsung Tablet Excellent Standard Credit Card Processing 4.0 Lenovo Tab
2 6679610.0 30192.0 Scott Jensen Shane85@gmail.com 4133 Young Canyon Geelong New South Wales 75929.0 Australia 48.0 Male Low Regular 4/26/2023 2023.0 April 04:06:29 3.0 354.477600 1063.432799 Books Penguin Books Children's Average Same-Day Credit Card Processing 2.0 Sports equipment
3 7232460.0 62101.0 Joseph Miller Mary34@gmail.com 8148 Thomas Creek Suite 100 Edmonton Ontario 88420.0 Canada 56.0 Male High Premium 05-08-23 2023.0 May 14:55:17 7.0 352.407717 2466.854021 Home Decor Home Depot Tools Excellent Standard PayPal Processing 4.0 Utility knife
4 4983775.0 27901.0 Debra Coleman Charles30@gmail.com 5813 Lori Ports Suite 269 Bristol England 48704.0 UK 22.0 Male Low Premium 01-10-24 2024.0 January 16:54:07 2.0 124.276524 248.553049 Grocery Nestle Chocolate Bad Standard Cash Shipped 1.0 Chocolate cookies
5 6095326.0 41289.0 Ryan Johnson Haley12@gmail.com 532 Ashley Crest Suite 014 Brisbane New South Wales 74430.0 Australia 58.0 Female Medium Premium 9/21/2023 2023.0 September 23:24:27 4.0 296.291806 1185.167224 Electronics Apple Tablet Good Express PayPal Pending 4.0 Lenovo Tab
6 5434096.0 97285.0 Erin Lewis Arthur76@gmail.com 600 Brian Prairie Suite 497 Kitchener Ontario 47545.0 Canada 29.0 Female Low New 6/26/2023 2023.0 June 13:35:51 2.0 315.057648 630.115295 Electronics Samsung Television Bad Standard Cash Processing 1.0 QLED TV
7 2344675.0 26603.0 Angela Fields Tanya94@gmail.com 237 Young Curve Munich Berlin 86862.0 Germany 29.0 Male Medium Premium 3/24/2023 2023.0 March 10:12:56 1.0 46.588070 46.588070 Clothing Zara Shirt Bad Same-Day Cash Processing 1.0 Dress shirt
8 4155845.0 80175.0 Diane Clark Martin39@gmail.com 8823 Mariah Heights Apt. 263 Wollongong New South Wales 39820.0 Australia 46.0 Male Medium New 01-06-24 2024.0 January 14:38:26 8.0 328.839302 2630.714413 Grocery Nestle Chocolate Bad Same-Day Cash Delivered 1.0 Dark chocolate
9 4926148.0 31878.0 Lori Bell Jessica33@gmail.com 6225 William Lodge Cologne Berlin 64317.0 Germany 25.0 Male Medium New 10-04-23 2023.0 October 22:27:40 10.0 397.611229 3976.112295 Home Decor Home Depot Decorations Excellent Standard Cash Delivered 4.0 Candles
Transaction_ID Customer_ID Name Email Address City State Zipcode Country Age Gender Income Customer_Segment Date Year Month Time Total_Purchases Amount Total_Amount Product_Category Product_Brand Product_Type Feedback Shipping_Method Payment_Method Order_Status Ratings products
294256 9862022.0 99656.0 Amber Fields Donald38@gmail.com 50022 Antonio Valley Suite 498 Plymouth England 92925.0 UK 43.0 Male Medium New 03-05-23 2023.0 March 14:47:57 7.0 21.686500 151.805498 Grocery Nestle Chocolate Bad Express Cash Pending 1.0 Chocolate mousse
294257 5561211.0 35974.0 Jordan Hall Holly86@gmail.com 806 Alan Flats Ballarat New South Wales 99633.0 Australia 61.0 Male Medium Premium 2/14/2024 2024.0 February 00:13:52 7.0 343.066709 2401.466964 Home Decor Bed Bath & Beyond Bathroom Bad Same-Day Cash Processing 1.0 Soap dispenser
294258 8961631.0 79479.0 Jason Welch Jason36@gmail.com 764 Garcia Flat Hamilton Ontario 61218.0 Canada 63.0 Male Low New 09-06-23 2023.0 September 17:37:41 6.0 443.329498 2659.976987 Home Decor Home Depot Tools Excellent Express Cash Pending 5.0 Level
294259 2844206.0 18799.0 Angel Hood Joseph24@gmail.com 7593 Joseph Trace Suite 382 Cairns New South Wales 39837.0 Australia 41.0 Male Medium Premium 5/28/2023 2023.0 May 13:11:33 6.0 397.452883 2384.717299 Electronics Apple Tablet Average Same-Day Cash Pending 2.0 Amazon Fire Tablet
294260 4833982.0 94117.0 Kara Hart Tammy37@gmail.com 872 Robinson Harbors Apt. 328 Charlotte Missouri 65301.0 USA 54.0 Male High New 10/14/2023 2023.0 October 15:06:09 5.0 472.424060 2362.120301 Clothing Nike Shorts Excellent Standard Cash Delivered 4.0 Chino shorts
294261 4246475.0 12104.0 Meagan Ellis Courtney60@gmail.com 389 Todd Path Apt. 159 Townsville New South Wales 4567.0 Australia 31.0 Male Medium Regular 1/20/2024 2024.0 January 23:40:29 5.0 194.792597 973.962984 Books Penguin Books Fiction Bad Same-Day Cash Processing 1.0 Historical fiction
294262 1197603.0 69772.0 Mathew Beck Jennifer71@gmail.com 52809 Mark Forges Hanover Berlin 16852.0 Germany 35.0 Female Low New 12/28/2023 2023.0 December 02:55:45 1.0 285.137301 285.137301 Electronics Apple Laptop Excellent Same-Day Cash Processing 5.0 LG Gram
294263 7743242.0 28449.0 Daniel Lee Christopher100@gmail.com 407 Aaron Crossing Suite 495 Brighton England 88038.0 UK 41.0 Male Low Premium 2/27/2024 2024.0 February 02:43:49 3.0 60.701761 182.105285 Clothing Adidas Jacket Average Express Cash Shipped 2.0 Parka
294264 9301950.0 45477.0 Patrick Wilson Rebecca65@gmail.com 3204 Baird Port Halifax Ontario 67608.0 Canada 41.0 Male Medium New 09-03-23 2023.0 September 11:20:31 1.0 120.834784 120.834784 Home Decor IKEA Furniture Good Standard Cash Shipped 4.0 TV stand
294265 2882826.0 53626.0 Dustin Merritt William14@gmail.com 143 Amanda Crescent Tucson West Virginia 25242.0 USA 28.0 Female Low Premium 01-08-24 2024.0 January 11:44:36 7.0 340.319059 2382.233417 Home Decor Home Depot Decorations Average Same-Day Cash Shipped 2.0 Clocks

Duplicate rows

Most frequently occurring

Transaction_ID Customer_ID Name Email Address City State Zipcode Country Age Gender Income Customer_Segment Date Year Month Time Total_Purchases Amount Total_Amount Product_Category Product_Brand Product_Type Feedback Shipping_Method Payment_Method Order_Status Ratings products # duplicates
0 3200766.0 49598.0 Mikayla Mueller Kenneth43@gmail.com 716 Joshua Rapids Apt. 790 Bremen Berlin 64747.0 Germany 59.0 Male Low New 11/23/2023 2023.0 November 15:35:49 9.0 272.327418 2450.946762 Grocery Pepsi Soft Drink Bad Standard Cash Pending 1.0 Iced tea 2
1 4476510.0 20103.0 Christine Kim James11@gmail.com 8176 Randy Squares Apt. 772 Kitchener Ontario 7099.0 Canada 54.0 Female Low Regular 12-10-23 2023.0 December 01:09:09 7.0 134.374182 940.619277 Clothing Adidas T-shirt Bad Express Credit Card Processing 1.0 Off-the-shoulder tee 2
2 4942326.0 25416.0 Pamela Martin Christine83@gmail.com 9851 Myers Tunnel Leicester England 57655.0 UK 63.0 Male Low Regular 05-05-23 2023.0 May 04:17:21 8.0 191.906886 1535.255087 Clothing Adidas Jacket Average Same-Day Cash Pending 2.0 Varsity jacket 2
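A table like the one above can be reproduced by counting exact row matches. The sketch below assumes rows are available as tuples and uses a hypothetical helper, `duplicate_rows`. Only three of the 29 columns are included to keep the example short, so it illustrates the counting step rather than the profiler's actual method.

```python
from collections import Counter

def duplicate_rows(rows):
    """Return {row: occurrences} for rows that appear more than once."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}

# (Transaction_ID, Customer_ID, products) only; values taken from rows in this report.
rows = [
    (3200766.0, 49598.0, "Iced tea"),
    (4476510.0, 20103.0, "Off-the-shoulder tee"),
    (3200766.0, 49598.0, "Iced tea"),
    (8691788.0, 37249.0, "Cycling shorts"),
]
dupes = duplicate_rows(rows)
for row, n in dupes.items():
    print(row, n)
```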