Feature Request: set_record + set_records #677

thp44 · 2019-05-02T13:43:29Z

get_all_records()
return nicely formatted dictionaries (thanks!)

set_record() (new feature)
inset one dictionary to google sheet (values under right heading)

set_records() (new feature)
inset list of dictionaries to google sheet (values under right heading)

muddi900 · 2024-06-30T17:21:31Z

This would be simple to implement, but it would be very slow.

We can add a mapping of the columns when the get_all_records are generated to the index.

Most of the work would be validation and testing.

alifeee · 2024-07-01T00:45:53Z

The way I could see this feature working is similar to Python's CSV DictWriter class

def append_records(
  headers: List[str],
  records: List[Dict[str, str]],
  default_blank: Any = None,
  ignore_extra_dict_items: bool = False,
  # extra kwargs?
)
  # use a provided "headers" array, probably obtained with worksheet.row_values(1)
  # new_rows = []
  # for each object in records
    # for each header in headers
      # if header not in headers, raise error! (unless kwarg)
      # if header not in object, raise error! (unless default blank)
      # new_rows[i].append(object[header])
  # use worksheet.append_rows to add new_rows to worksheet

and append_record (singular) would call append_records with a wrapped record, like [record] (like append_row and appeend_rows)

Questions I would have:

should we provide value input option kwargs?
should we only offer starting from column 1? (probably easiest)

This would be simple to implement, but it would be very slow.

slow? not sure what you mean

We can add a mapping of the columns when the get_all_records are generated to the index.

also not sure what you mean

if you desire to implement it, I think this could be a nice feature @muddi900 :)

muddi900 · 2024-07-01T12:28:07Z

It would be slow in the sense that the order may not be correct each time. So we have to make sure the order is correct. Or use memoization: create an list of None the size of the record, loop through the whole record each time and fill the appropriate value according column mapping.

alifeee · 2024-07-02T09:15:41Z

not sure what you mean? dictionaries do not have an order as I understand it (or should not)

So I would see the list being created with something like

default_blank = None
ignore_extra_headers = False
data = {"fruit": "Apple", "tastiness": "very tasty"}
headers = ["fruit", "tastiness"]

new_row = []
if not set(data).issubset(set(headers)) and ignore_extra_headers == False:
  raise ValueError("Extra headers found in data")
for key in headers:
  if key not in data and default_blank is None:
    raise ValueError(f"No key <{key}> found in data")
  else:
    new_row.append(default_blank)
    continue
  new_row.append(data[key])
# new_row is ["Apple", "very tasty"]

lavigne958 · 2024-07-03T22:40:37Z

I agree with your code @alifeee , small detail here, remember that data is List[Dict[str,str]] as we have a dict per row.
it looks good and the slowness here is depending on the length of data, with 1M rows of course it would take some time to run. Though I don't think a sheet can hold 1M rows 🤔 (may be it does ? )

so far we need to wrap the above code in a for loop iterating on data and run that for each dict found in the list data.
it takes sometime to run if we have a lot of rows in data but nothing to long I believe.

alifeee · 2024-07-05T14:02:03Z

yes. my example, effectively, was an implementation of set_record (singular data), instead of set_records (plural data)

I think slowness will not be a problem. We will see if it is after making the code.

lavigne958 · 2024-07-05T22:05:28Z

alright then, let's put it in the next milestone.

muddi900 · 2024-07-17T11:10:38Z

Is there an efficient way to getting the last row of a data range?

Because we would have to call get_all_values every time with set_records

lavigne958 · 2024-07-17T21:46:28Z

Is there an efficient way to getting the last row of a data range?

if you already know the range yes, if you don't know the range, like getting the last row in the worksheet without knowing the height of the worksheet -> then no you don't.

Because we would have to call get_all_values every time with set_records

not necessarily, depends what we want the it to do 🙃
from what I understand:

the method takes a dict value
- the keys are: the header name
- the values are: the value to append below each header
we need to read the 1st row so we know which column contains which header
then we push the data starting from row 2.

This can be improved by adding an extra argument:

starting_row: the row where we should start to append the data

or this can be done using the API appendRow which allows us to pass a row of data, the API will handle the "last row" problem for us and append the row bellow: https://developers.google.com/sheets/api/reference/rest/v4/spreadsheets.values/append

this might be the easiest solution, though the extra input starting_row allow a user to choose where to put the data too.

muddi900 · 2024-07-19T11:30:53Z

If a person is using set_records, they are probably appending the data range, rather than the creating the first row.

I have implemented a basic solution locally.

def set_records(self, rows: List[Dict[str, Any]]) -> Response:
        cols = self.column_headers
        insert_rows = []
        for row in rows:
            insert_row = []
            for col in cols:
                insert_row.append(row[col])
            insert_rows.append(insert_row)

        url = SPREADSHEET_VALUES_APPEND_URL % (
            self.spreadsheet_id,
            f"'{self.title}'!A1",
        )
        resp = self.append_rows(
            insert_rows,
            value_input_option=ValueInputOption.user_entered,
        )

        return resp

    def set_record(self, row: Dict[str, Any]) -> None:
        self.set_records([row])

I also implemented a getter/setter for 'column_headers'.

 if client is None or not isinstance(client, HTTPClient):
            raise RuntimeError(
                """Missing HTTP Client, it must be provided with a
                valid instance of type gspread.http_client.HTTPClient .
                Please allocate new Worksheet object using method like:
                spreadsheet.get_worksheet(0)
                """
            )

        self.spreadsheet_id = spreadsheet_id
        self.client = client
        self._properties = properties

        # kept for backward compatibility - publicly available
        # do not use if possible.
        self._spreadsheet = spreadsheet
+        self._column_headers = []
+
+    @property
+    def column_headers(self) -> List[str]:
+        if not self._column_headers:
+            self._column_headers = self.row_values(1)
+        return self._column_headers
+
+    @column_headers.setter
+    def column_headers(self, value: List[str]) -> None:
+        self._column_headers = value

# ...

def get_all_records(
        self,
        head: int = 1,
        expected_headers: Optional[List[str]] = None,
        value_render_option: Optional[ValueRenderOption] = None,
        default_blank: str = "",
        numericise_ignore: Iterable[Union[str, int]] = [],
        allow_underscores_in_numeric_literals: bool = False,
        empty2zero: bool = False,
    ) -> List[Dict[str, Union[int, float, str]]]:
        """Returns a list of dictionaries, all of them having the contents of
        the spreadsheet with the head row as keys and each of these
        dictionaries holding the contents of subsequent rows of cells as
        values.

        This method uses the function :func:`gspread.utils.to_records` to build the resulting
        records. It mainly wraps around the function and handle the simplest use case
        using a header row (default = 1) and the the reste of the entire sheet.

        .. note::

           for any particular use-case, please get your dataset, your headers
           then use the function :func:`gspread.utils.to_records` to build the records.

        Cell values are numericised (strings that can be read as ints or floats
        are converted), unless specified in numericise_ignore

        :param int head: (optional) Determines which row to use as keys,
            starting from 1 following the numeration of the spreadsheet.
        :param list expected_headers: (optional) List of expected headers, they must be unique.

            .. note::

                returned dictionaries will contain all headers even if not included in this list
        :param value_render_option: (optional) Determines how values should
            be rendered in the output. See `ValueRenderOption`_ in
            the Sheets API.
        :type value_render_option: :class:`~gspread.utils.ValueRenderOption`
        :param str default_blank: (optional) Determines which value to use for
            blank cells, defaults to empty string.
        :param list numericise_ignore: (optional) List of ints of indices of
            the columns (starting at 1) to ignore numericising, special use
            of ['all'] to ignore numericising on all columns.
        :param bool allow_underscores_in_numeric_literals: (optional) Allow
            underscores in numeric literals, as introduced in PEP 515
        :param bool empty2zero: (optional) Determines whether empty cells are
            converted to zeros when numericised, defaults to False.


        Examples::

            # Sheet data:
            #      A    B    C
            #
            # 1    A1   B2   C3
            # 2    A6   B7   C8
            # 3    A11  B12  C13

            # Read all rows from the sheet
            >>> worksheet.get_all_records()
            [
                {"A1": "A6", "B2": "B7", "C3": "C8"},
                {"A1": "A11", "B2": "B12", "C3": "C13"}
            ]
        """
        entire_sheet = self.get(
            value_render_option=value_render_option,
            pad_values=True,
        )
        if entire_sheet == [[]]:
            # see test_get_all_records_with_all_values_blank
            #  we don't know the length of the sheet so we return []
            return []

        keys = entire_sheet[head - 1]
+        self.column_headers = keys
        values = entire_sheet[head:]

        if expected_headers is None:
            # all headers must be unique
            header_row_is_unique = len(keys) == len(set(keys))
            if not header_row_is_unique:
                raise GSpreadException(
                    "the header row in the worksheet is not unique, "
                    "try passing 'expected_headers' to get_all_records"
                )
        else:
            # all expected headers must be unique
            expected_headers_are_unique = len(expected_headers) == len(
                set(expected_headers)
            )
            if not expected_headers_are_unique:
                raise GSpread

I will add error handling, header validation, docs and testing, but this works.

lavigne958 · 2024-07-23T20:08:29Z

I am sorry but I disagree with you, we never know how the users will use the feature, some could be interested in inserting values starting at some specific rows. We can enforce the feature to only be able to append values. that's a design choice we need to think first.

muddi900 · 2024-07-24T09:32:33Z

I think it would better to have two sets of methods append_record(s) and insert_record(s). This would serve all use cases as I can see them. I'll add the commits to the PR.

lavigne958 · 2024-08-18T11:24:14Z

I think it would better to have two sets of methods append_record(s) and insert_record(s). This would serve all use cases as I can see them. I'll add the commits to the PR.

that's a very good idea. let's go for this option.

thp44 changed the title ~~Feature Request: Insert record~~ Feature Request: set_record + set_records May 2, 2019

alifeee added Feature Request good first issue labels Jun 29, 2024

lavigne958 added this to the 6.2.0 milestone Jul 5, 2024

muddi900 added a commit to muddi900/gspread that referenced this issue Jul 17, 2024

Added a column headers property and getters [Issue burnash#677]

4dca2b5

muddi900 linked a pull request Jul 22, 2024 that will close this issue

Added set_records and set_record to update sheet via records.[Issue #677] #1494

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature Request: set_record + set_records #677

Feature Request: set_record + set_records #677

thp44 commented May 2, 2019 •

edited

Loading

muddi900 commented Jun 30, 2024

alifeee commented Jul 1, 2024

muddi900 commented Jul 1, 2024

alifeee commented Jul 2, 2024

lavigne958 commented Jul 3, 2024

alifeee commented Jul 5, 2024

lavigne958 commented Jul 5, 2024

muddi900 commented Jul 17, 2024

lavigne958 commented Jul 17, 2024

muddi900 commented Jul 19, 2024

lavigne958 commented Jul 23, 2024

muddi900 commented Jul 24, 2024

lavigne958 commented Aug 18, 2024

Feature Request: set_record + set_records #677

Feature Request: set_record + set_records #677

Comments

thp44 commented May 2, 2019 • edited Loading

muddi900 commented Jun 30, 2024

alifeee commented Jul 1, 2024

muddi900 commented Jul 1, 2024

alifeee commented Jul 2, 2024

lavigne958 commented Jul 3, 2024

alifeee commented Jul 5, 2024

lavigne958 commented Jul 5, 2024

muddi900 commented Jul 17, 2024

lavigne958 commented Jul 17, 2024

muddi900 commented Jul 19, 2024

lavigne958 commented Jul 23, 2024

muddi900 commented Jul 24, 2024

lavigne958 commented Aug 18, 2024

thp44 commented May 2, 2019 •

edited

Loading