Read specific information from nested json in python pandas


By : user6021932
Date : October 16 2020, 06:10 AM
I have a nested JSON file which I read with Python, and I am interested only in specific information from it. Given a JSON file like that, I would just build the data by hand:
code :
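import json
import pandas as pd

with open(r"filepath") as data_file:
    data = json.load(data_file)["kistler"]["actualValues"]

# Keep only the "value" field of each entry, one column per key.
df = pd.DataFrame({k: [v["value"]] for k, v in data.items()})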
   Zykluszaehler  Zykluszeit  Parameter 1  Parameter 2
0          196.0      5082.0          0.0          0.0
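To make the approach above easier to try, here is a minimal self-contained sketch. The JSON text is a hypothetical sample: only the "kistler" / "actualValues" / "value" keys and the four column names come from the answer, while the "unit" fields and the overall layout are assumptions for illustration.

```python
import json

import pandas as pd

# Hypothetical sample mirroring the structure the answer indexes into.
# The "unit" fields are invented; they stand for data we want to skip.
raw = """
{
  "kistler": {
    "actualValues": {
      "Zykluszaehler": {"value": 196.0, "unit": ""},
      "Zykluszeit": {"value": 5082.0, "unit": "ms"},
      "Parameter 1": {"value": 0.0, "unit": ""},
      "Parameter 2": {"value": 0.0, "unit": ""}
    }
  }
}
"""

data = json.loads(raw)["kistler"]["actualValues"]

# One column per key, keeping only the "value" field of each entry.
df = pd.DataFrame({k: [v["value"]] for k, v in data.items()})
print(df)
```

Wrapping each value in a one-element list is what gives a single-row frame; without the brackets pandas would complain about scalar values and a missing index.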



Pandas read nested json


By : nishan
Date : March 29 2020, 07:55 AM
You can use json_normalize:
code :
import json
import pandas as pd

with open('myJson.json') as data_file:
    data = json.load(data_file)

# pd.json_normalize (pandas >= 1.0) replaces the deprecated pandas.io.json.json_normalize.
df = pd.json_normalize(data, 'locations', ['date', 'number', 'name'],
                       record_prefix='locations_')
print(df)
  locations_arrTime locations_arrTimeDiffMin locations_depTime  \
0                                                        06:32   
1             06:37                        1             06:40   
2             08:24                        1                     

  locations_depTimeDiffMin           locations_name locations_platform  \
0                        0  Spital am Pyhrn Bahnhof                  2   
1                        0  Windischgarsten Bahnhof                  2   
2                                    Linz/Donau Hbf               1A-B   

  locations_stationIdx locations_track number    name        date  
0                    0          R 3932         R 3932  01.10.2016  
1                    1                         R 3932  01.10.2016  
2                   22                         R 3932  01.10.2016 
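Alternatively, read the file with read_json and join the location names per group:
code :
import pandas as pd

df = pd.read_json("myJson.json")
# Keep only the 'name' of each location dict.
df['locations'] = pd.DataFrame(df['locations'].values.tolist())['name']
df = df.groupby(['date', 'name', 'number'])['locations'].apply(','.join).reset_index()
print(df)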
        date    name number                                          locations
0 2016-01-10  R 3932         Spital am Pyhrn Bahnhof,Windischgarsten Bahnho... 
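As a runnable illustration of the json_normalize call, the sketch below uses an inline record instead of myJson.json. The record is a hypothetical, cut-down version of the data in the question (only two fields per location are kept), so treat the exact shape as an assumption.

```python
import pandas as pd

# Hypothetical record: top-level metadata plus a nested list of locations.
data = {
    "date": "01.10.2016",
    "number": "R 3932",
    "name": "R 3932",
    "locations": [
        {"name": "Spital am Pyhrn Bahnhof", "depTime": "06:32"},
        {"name": "Windischgarsten Bahnhof", "depTime": "06:40"},
        {"name": "Linz/Donau Hbf", "depTime": ""},
    ],
}

# record_path names the nested list to expand into rows;
# meta lists the top-level fields to repeat on every row.
flat = pd.json_normalize(
    data,
    record_path="locations",
    meta=["date", "number", "name"],
    record_prefix="locations_",
)
print(flat)
```

The record_prefix matters here because both the top level and each location have a "name" field; prefixing the record columns keeps the two from colliding.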

Pandas read nested json with NaN entries


By : Marc Clifton
Date : March 29 2020, 07:55 AM
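I am trying to read a JSON file that has nested dictionaries by following this pandas tutorial. The problem is that some of my nested lists/dictionaries are NaN, so calling the normalize function raises a KeyError, since the key only exists for certain elements at the higher level of the dictionary. If I understand correctly, you can normalize only the records that have a 'Summary' and then add the remaining records back. Note that DataFrame.append was removed in pandas 2.0, so on current versions only the second form, with pd.concat, will run: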
code :
In [94]: (json_normalize([x for x in q if x.get('Summary')],
                         'Summary',
                         ['Code', 'datetime'])
    ...:                .append(pd.DataFrame([x for x in q if not x.get('Summary')])))
    ...:
Out[94]:
  Code angel                               datetime  devil     num  prod
0   IO   NaN     2013-10-14T16:30-05:00[US/Eastern]   17.0   81.04  book
1   IO   NaN     2013-10-14T16:30-05:00[US/Eastern]   10.0  191.50  game
2   IO   NaN     2013-10-14T16:30-05:00[US/Eastern]   -6.0   55.50  desk
3   IO   ipo     2013-10-14T16:30-05:00[US/Eastern]    1.0  503.00   NaN
4   WX   yut  2017-10-13T05:00:02-05:00[US/Eastern]    0.0    0.00  read
5   WX   fgf  2017-10-13T05:00:02-05:00[US/Eastern]    0.0     NaN  fart
6   WX  deft  2017-10-13T05:00:02-05:00[US/Eastern]    0.0  673.00   red
7   WX   NaN  2017-10-13T05:00:02-05:00[US/Eastern]    0.0     NaN   dog
8   WX   hut  2017-10-13T05:00:02-05:00[US/Eastern]   99.0     NaN   NaN
0   GE   NaN  2011-11-14T19:30:03-05:00[US/Eastern]    NaN     NaN   NaN
1   PP   NaN     2012-21-14T18:50-05:00[US/Eastern]    NaN     NaN   NaN
2   BI   NaN     2014-11-14T12:30-05:00[US/Eastern]    NaN     NaN   NaN
3   EZ   NaN     2015-12-14T10:00-05:00[US/Eastern]    NaN     NaN   NaN
4   JC   NaN  2016-10-14T08:30:01-05:00[US/Eastern]    NaN     NaN   NaN
In [95]: pd.concat([json_normalize([x for x in q if x.get('Summary')],
    ...:                           'Summary',
    ...:                           ['Code', 'datetime']),
    ...:            pd.DataFrame([x for x in q if not x.get('Summary')])],
    ...:           ignore_index=True)
    ...:
Out[95]:
   Code angel                               datetime  devil     num  prod
0    IO   NaN     2013-10-14T16:30-05:00[US/Eastern]   17.0   81.04  book
1    IO   NaN     2013-10-14T16:30-05:00[US/Eastern]   10.0  191.50  game
2    IO   NaN     2013-10-14T16:30-05:00[US/Eastern]   -6.0   55.50  desk
3    IO   ipo     2013-10-14T16:30-05:00[US/Eastern]    1.0  503.00   NaN
4    WX   yut  2017-10-13T05:00:02-05:00[US/Eastern]    0.0    0.00  read
5    WX   fgf  2017-10-13T05:00:02-05:00[US/Eastern]    0.0     NaN  fart
6    WX  deft  2017-10-13T05:00:02-05:00[US/Eastern]    0.0  673.00   red
7    WX   NaN  2017-10-13T05:00:02-05:00[US/Eastern]    0.0     NaN   dog
8    WX   hut  2017-10-13T05:00:02-05:00[US/Eastern]   99.0     NaN   NaN
9    GE   NaN  2011-11-14T19:30:03-05:00[US/Eastern]    NaN     NaN   NaN
10   PP   NaN     2012-21-14T18:50-05:00[US/Eastern]    NaN     NaN   NaN
11   BI   NaN     2014-11-14T12:30-05:00[US/Eastern]    NaN     NaN   NaN
12   EZ   NaN     2015-12-14T10:00-05:00[US/Eastern]    NaN     NaN   NaN
13   JC   NaN  2016-10-14T08:30:01-05:00[US/Eastern]    NaN     NaN   NaN
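The same idea can be tried on a small inline list. The sketch below is an assumption-laden reduction of the session above: the contents of q are invented (the original q is not shown), and the records without a 'Summary' are trimmed to 'Code' and 'datetime' so that no empty 'Summary' column appears.

```python
import pandas as pd

# Hypothetical input: 'Summary' is a list for some records, None or absent for others.
q = [
    {
        "Code": "IO",
        "datetime": "2013-10-14",
        "Summary": [
            {"prod": "book", "num": 81.04},
            {"prod": "game", "num": 191.5},
        ],
    },
    {"Code": "GE", "datetime": "2011-11-14", "Summary": None},
    {"Code": "PP", "datetime": "2012-12-14"},
]

# Split the records so json_normalize only sees those it can expand.
with_summary = [x for x in q if x.get("Summary")]
without_summary = [
    {"Code": x["Code"], "datetime": x["datetime"]}
    for x in q
    if not x.get("Summary")
]

result = pd.concat(
    [
        pd.json_normalize(with_summary, "Summary", ["Code", "datetime"]),
        pd.DataFrame(without_summary),
    ],
    ignore_index=True,
)
print(result)
```

Using x.get("Summary") treats a missing key, None, and an empty list the same way, which is what keeps json_normalize from raising on the incomplete records.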

Automate the read and save of several json files (with different information) to different pandas dataframes


By : christine thompson
Date : March 29 2020, 07:55 AM
Assuming that your files are all named following the same logic, I would do the following:
code :
import json
import pandas as pd

# You can use os.walk to get a list of files from a specific folder.
files = ['f_fruit.json', 'f_clothes.json', 'f_games.json']

for file_name in files:
    col_name = file_name.split('.')[0][2:]
    with open(file_name, 'r') as f:
        data = json.load(f)
    var_name = 'df_{}'.format(col_name)
    globals()[var_name] = pd.DataFrame(data[col_name])
>>> col_name = 'fruit'
>>> var_name = 'df_{}'.format(col_name)
>>> globals()[var_name] = 'some value'
>>> df_fruit
'some value'
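Creating variables through globals() works, but a plain dictionary keyed by name is a common alternative that is easier to inspect and loop over. The sketch below is one possible variant, not the answer's own method: the sample file contents and the temporary folder are assumptions made so the code runs on its own, and it keeps the answer's convention that each file holds a top-level key matching its name.

```python
import json
import os
import tempfile

import pandas as pd

# Hypothetical sample files, written to a temp folder so the sketch is self-contained.
samples = {
    "f_fruit.json": {"fruit": [{"name": "apple", "qty": 3}, {"name": "pear", "qty": 1}]},
    "f_games.json": {"games": [{"title": "chess", "players": 2}]},
}
folder = tempfile.mkdtemp()
for file_name, content in samples.items():
    with open(os.path.join(folder, file_name), "w") as f:
        json.dump(content, f)

# One DataFrame per file, stored under a readable key instead of a global name.
frames = {}
for file_name in sorted(os.listdir(folder)):
    key = os.path.splitext(file_name)[0][2:]  # "f_fruit.json" -> "fruit"
    with open(os.path.join(folder, file_name)) as f:
        data = json.load(f)
    frames[key] = pd.DataFrame(data[key])

print(frames["fruit"])
```

With this layout, frames["fruit"] plays the role of df_fruit, and iterating over frames.items() reaches every DataFrame without knowing the names in advance.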

how to read nested json file in pandas dataframe?


By : user2500948
Date : March 29 2020, 07:55 AM
I learned how to load a JSON file into a pandas DataFrame. However, I have multiple JSON files about news, and each file holds a rather complicated nested structure representing the news content and its metadata. I need to read them into a pandas DataFrame for downstream analysis, but the solution I learned doesn't work for my files. Here is an example JSON data snippet (example json file), and here is what I tried:
code :
import os
import glob
import json

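import pandas as pd

path_to_json = 'FakeNewsNetData/BuzzFeed/FakeNewsContent/'
json_paths = glob.glob(os.path.join(path_to_json, "*.json"))

def load_json(path):
    # Open each file in a context manager so the handle is closed promptly.
    with open(path) as f:
        return json.load(f)

# pd.json_normalize (pandas >= 1.0) replaces the deprecated pandas.io.json.json_normalize.
df = pd.concat((pd.json_normalize(load_json(p)) for p in json_paths), axis=0)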
df = df.reset_index(drop=True)  # Optionally reset index.
# This will be a set of entries you wish to remove.
# Here we only consider "View All Posts".
invalid_entries = {"View All Posts"}

import functools
def fix(x, invalid):
    if isinstance(x, list):
        return [i for i in x if i not in invalid]
    else:
        # You can optionally choose to return [] here to fix the NaNs
        # and to standardize the types of the values in this column
        return x

fix_author = functools.partial(fix, invalid=invalid_entries)
df["authors"] = df.authors.apply(fix_author)
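To see what the cleaning step does without the news files, here is a small sketch on a hand-made column. The author names and the None entry are hypothetical; the fix function is the same as above.

```python
import functools

import pandas as pd


def fix(x, invalid):
    # Filter list entries; pass anything else (e.g. a missing value) through unchanged.
    if isinstance(x, list):
        return [i for i in x if i not in invalid]
    return x


# Hypothetical "authors" column: two lists and one missing value.
df = pd.DataFrame(
    {"authors": [["Jane Doe", "View All Posts"], ["View All Posts"], None]}
)

fix_author = functools.partial(fix, invalid={"View All Posts"})
df["authors"] = df["authors"].apply(fix_author)
print(df)
```

After the call, the first row keeps only "Jane Doe", the second row becomes an empty list, and the missing value is left as it was.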

How to parse specific parts of nested JSON format into csv in python (pandas)


By : user3611110
Date : March 29 2020, 07:55 AM
You can do something like this. Since you didn't provide an example output, I came up with a format of my own.
code :
import json
import csv

with open(r'file.txt') as f:
    data = json.load(f)
with open("output.csv", mode="w", newline='') as out:
    w = csv.writer(out)
    header = ["id","name","path","tags","points"]
    w.writerow(header)
    for asset in data["assets"]:
        data_point = data["assets"][asset]
        output = [data_point["asset"]["id"]]
        output.append(data_point["asset"]["name"])
        output.append(data_point["asset"]["path"])
        output.append(data_point["regions"][0]["tags"])
        output.append(data_point["regions"][0]["points"])
        w.writerow(output)
id,name,path,tags,points
0b8f6f214dc7066b00b50ae16cf25cf6,1.jpg,c:\temp\1.jpg,"['3', '9', 'Dark Poor']","[{'x': 167.41071428571428, 'y': 252.02922077922076}, {'x': 208.80681818181816, 'y': 891.2337662337662}, {'x': 1252.232142857143, 'y': 936.2824675324675}, {'x': 1279.017857142857, 'y': 241.07142857142856}]"
0155d8143c8cad85b5b9d392fd2895a4,2.jpg,c:\temp\2.jpg,['Dark Poor'],"[{'x': 152.39423076923077, 'y': 311.68831168831167}, {'x': 144.08653846153848, 'y': 802.077922077922}, {'x': 964.4711538461539, 'y': 781.2987012987012}, {'x': 935.3942307692308, 'y': 299.2207792207792}]"
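Since the question mentions pandas, the same extraction could also be sketched with a DataFrame and to_csv instead of csv.writer. This is an alternative under assumptions, not the answer's method: the inline data is a hypothetical, shortened stand-in for file.txt, and only the key names ("assets", "asset", "regions", "tags", "points") are taken from the code above.

```python
import io

import pandas as pd

# Hypothetical, shortened stand-in for the parsed contents of file.txt.
data = {
    "assets": {
        "a1": {
            "asset": {"id": "a1", "name": "1.jpg", "path": "c:\\temp\\1.jpg"},
            "regions": [{"tags": ["3", "9"], "points": [{"x": 1.0, "y": 2.0}]}],
        },
        "a2": {
            "asset": {"id": "a2", "name": "2.jpg", "path": "c:\\temp\\2.jpg"},
            "regions": [{"tags": ["Dark Poor"], "points": [{"x": 3.0, "y": 4.0}]}],
        },
    }
}

# Pick the wanted fields from each asset; only the first region is used, as above.
rows = [
    {
        "id": entry["asset"]["id"],
        "name": entry["asset"]["name"],
        "path": entry["asset"]["path"],
        "tags": entry["regions"][0]["tags"],
        "points": entry["regions"][0]["points"],
    }
    for entry in data["assets"].values()
]

df = pd.DataFrame(rows, columns=["id", "name", "path", "tags", "points"])

# Write to an in-memory buffer here; pass a file path to to_csv to write to disk.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

As in the csv.writer version, the list-valued columns end up as their Python string representation inside the CSV cells.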
